#StackBounty: #python #python-3.x #docker #docker-compose Python multiprocessing crashes docker container

Bounty: 100

Here is a simple Python multiprocessing script that works like a charm when I run it from the console:

# mp.py
import multiprocessing as mp


def do_smth():
    print('something')


if __name__ == '__main__':
    ctx = mp.get_context("spawn")
    p = ctx.Process(target=do_smth, args=tuple())
    p.start()
    p.join()

Result:

> $ python3 mp.py
something

Then I created a simple Docker image with this Dockerfile:

FROM python:3.6

ADD . /app
WORKDIR /app

And docker-compose.yml:

version: '3.6'

services:
  bug:
    build:
      context: .
    environment:
      - PYTHONUNBUFFERED=1
    command: su -c "python3.6 forever.py"

Where forever.py is:

from time import sleep

if __name__ == '__main__':
    i = 0
    while True:
        sleep(1.0)
        i += 1
        print(f'hello {i:3}')

Now I run forever.py with docker compose:

> $ docker-compose build && docker-compose up 
...
some output
...
Attaching to mpbug_bug_1
bug_1  | hello   1
bug_1  | hello   2
bug_1  | hello   3
bug_1  | hello   4

Up to this point everything works as expected. But when I run mp.py inside the Docker container, the container crashes without any message:

> $ docker exec -it mpbug_bug_1 /bin/bash
root@09779ec47f9d:/app# python mp.py 
something
root@09779ec47f9d:/app# % 

Gist with the code can be found here: https://gist.github.com/ilalex/83649bf21ef50cb74a2df5db01686f18

Can you explain why the Docker container crashes and how to run the script without crashing it?

Thank you in advance!
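
Since the container stops without any message, one way to narrow this down is to make forever.py report any signal it receives. The following is a minimal diagnostic sketch, not a fix: the names `received`, `log_signal`, and `install_handlers` are hypothetical, and whether a signal is involved at all is an assumption to be checked.

```python
import signal

# Names of the signals seen so far (hypothetical helper state).
received = []


def log_signal(signum, frame):
    """Record and print the signal name instead of exiting silently."""
    name = signal.Signals(signum).name
    received.append(name)
    print(f'received {name}', flush=True)


def install_handlers(signums=(signal.SIGTERM, signal.SIGHUP, signal.SIGINT)):
    """Install log_signal for each listed signal (SIGKILL cannot be caught)."""
    for signum in signums:
        signal.signal(signum, log_signal)
```

Calling `install_handlers()` at the start of the main block in forever.py would print a line if one of these signals reaches the process. In mp.py, printing `p.exitcode` after `p.join()` is a similar check: a negative value -N means the child was terminated by signal N.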


Get this bounty!!!

#StackBounty: #python #json #python-3.x #csv #parsing How to read and map CSV's multi line header rows using python

Bounty: 50

I have a CSV file exported from a database (it comes as CSV), and now I have to parse it into a JSON schema. Don’t worry, the link is just a GitHub gist.


The problem I am facing is that the file has a multi-line header; check the CSV file here.

If you take notice in the file:

  1. The 1st line of the CSV holds the 1st line of headers, and the next line
    holds all the values for those headers.

  2. The 3rd line of the CSV holds the 2nd line of headers, and the next line
    holds all the values for those headers.

  3. The 5th line of the CSV holds the 3rd line of headers, and the next line
    holds all the values for those headers.

You can also notice a pattern here:

  • The 1st line of headers has no tab
  • The 2nd line of headers has only one tab
  • The 3rd line of headers has two tabs

This goes for all the records.
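
The pattern above can be sketched in code as follows. This is only an illustration under stated assumptions: the nesting level is taken to be the number of leading tab characters on a header line, every header line is followed by exactly one value line, and fields are comma-separated. The function name `read_leveled_records` and the sample data are made up for the example; if the indentation in the real file is stored as empty leading cells rather than tab characters, the level would have to be counted differently.

```python
import csv
import io


def read_leveled_records(lines):
    """Yield (level, record) pairs from alternating header/value lines.

    Assumption: level = number of leading tab characters on the header
    line, and each header line is followed by exactly one value line.
    """
    lines = [ln.rstrip('\n') for ln in lines if ln.strip()]
    for header_line, value_line in zip(lines[0::2], lines[1::2]):
        level = len(header_line) - len(header_line.lstrip('\t'))
        headers = next(csv.reader([header_line.lstrip('\t')]))
        values = next(csv.reader([value_line.lstrip('\t')]))
        yield level, dict(zip(headers, values))


# Made-up sample in the assumed layout (not taken from the real file).
sample = io.StringIO(
    'Order Ref,Order Status\n'
    '1001,Confirmed\n'
    '\tItem ID,Type\n'
    '\t7,Flight\n'
    '\t\tOrigin,Destination\n'
    '\t\tLHR,JFK\n'
)
records = list(read_leveled_records(sample))
```

Each record then carries its level (0, 1, or 2), which is the information needed to decide where it belongs in the nested output.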

So the 1st problem is the multi-line headers.
The 2nd problem is how to parse the file into nested JSON matching my schema.
One of the solutions I tried is "Create nested JSON from CSV", which is where I noticed the 1st problem with my CSV.

I have been on this for 2 days and still haven’t found a solution.

My code looks like this. So far I am only trying to parse the initial fields of the schema:

import csv
import json

# Column names, in file order. "Customer Name" and "Currency" appear twice;
# csv.DictReader keeps only the last value for a repeated name.
FIELDNAMES = (
    "Order Ref", "Order Status", "Affiliate", "Source", "Agent",
    "Customer Name", "Customer Name", "Email Address", "Telephone", "Mobile",
    "Address 1", "Address 2", "City", "County/State", "Postal Code",
    "Country", "Voucher Code", " Voucher Amount", "Order Date", "Item ID",
    "Type", "Supplier Code", "Supplier Name", "Booking Ref", "Supplier Price",
    "Currency", "Selling Price", "Currency", "Depart", "Arrive", "Origin",
    "Destination", "Carrier", "Flight No", "Class", "Pax Type", "Title",
    "Firstname", "Surname", "DOB", "Gender", "FOID Type",
)


def csvParse(csvfile):
    data = []
    # Open the CSV; newline='' is what the csv module expects.
    with open(csvfile, 'r', newline='') as f:
        reader = csv.DictReader(f, fieldnames=FIELDNAMES)
        # Build one frame per row from the initial schema fields.
        for row in reader:
            frame = {"orderRef": row["Order Ref"],
                     "orderStatus": row["Order Status"],
                     "affiliate": row["Affiliate"],
                     "source": row["Source"],
                     "customers": []}
            data.append(frame)
    return data

JSON schema:

{
  orderRef: number,
  orderStatus: string,
  affiliate: string,
  source: string,
  agent: string,
  customer: {
    name: string,
    email: string,
    telephone: string,
    mobile: string,
    address: {
      address1: string,
      address2: string,
      city: string,
      country: string,
      postcode: string,
      country: string
    },
  },
  voucherCode: string,
  voucherAmount: number,
  orderDate: date,
  items:[
    {
      itemId: number,
      type: string,
      supplierCode: string,
      supplierName: string,
      bookingReference: string,
      supplierPrice: float,
      supplierPriceCurrency: string,
      sellingPrice: float,
      sellingPriceCurrency: string,
      legs: [
        {
          departureDate: datetime,
          arrivalDate: datetime, // can be null if not available
          origin: string,
          destination: string,
          carrier: string,
          flightNumber: string,
          class: string
        }
      ],
      passengers: [
        {
          passengerType: string,
          title: string,
          firstName: string,
          surName: string,
          dob: string,
          gender: string,
          foidType: string
        }
      ]
    }
  ]

}
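
One possible way to get from level-tagged rows to a nested structure like the schema above is to keep a stack of the most recent record at each level. This is a rough sketch, assuming the rows have already been turned into `(level, dict)` pairs and that levels never jump by more than one. The function name `nest_records`, the generic `children` key, and the sample values are all hypothetical; mapping `children` onto schema keys such as `items`, `legs`, or `passengers` would still have to be done separately.

```python
import json


def nest_records(leveled):
    """Attach each record to the latest record one level above it.

    `leveled` is an iterable of (level, dict) pairs; nested records go
    into a 'children' list (a hypothetical key, not part of the schema).
    """
    roots = []
    stack = []  # stack[i] is the most recent record seen at level i
    for level, record in leveled:
        node = dict(record, children=[])
        del stack[level:]  # drop records at this level or deeper
        if stack:
            stack[-1]['children'].append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


# Made-up sample rows, already tagged with their nesting level.
leveled = [
    (0, {'orderRef': '1001'}),
    (1, {'itemId': '7'}),
    (2, {'origin': 'LHR'}),
    (1, {'itemId': '8'}),
]
orders = nest_records(leveled)
print(json.dumps(orders, indent=2))
```

Here the level-2 row ends up under item 7, and item 8 becomes a second child of the same order.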



#StackBounty: #python #sql #django #python-3.x #orm How do I use timedelta with a column in my Django ORM query?

Bounty: 150

I’m using Django and Python 3.7. I have the following three models:

class Article(models.Model):
    ...
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE, related_name="articles",)
    created_on = models.DateTimeField(default=datetime.now)

class WebPageStat(models.Model):
    ...
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE, related_name="stats", )

    elapsed_time_in_seconds = models.FloatField(default=0)
    score = models.BigIntegerField(default=0)

class Publisher(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name

I want to write a Django ORM query that, given a publisher and an elapsed time in seconds (from a WebPageStat record), finds all articles whose “created_on” date is not older than that elapsed time. Other posts suggest using “timedelta”, but that doesn’t seem to work for me here:

Article.objects.filter(created_on__lte=datetime.now(timezone.utc) - timedelta(hours=0, minutes=0, seconds=publisher__stats__elapsed_time_in_seconds))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
NameError: name 'publisher__stats__elapsed_time_in_seconds' is not defined

Can I use timedelta with a value taken from a SQL column? If not, how do I do this?
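
For reference, the NameError appears because Python evaluates `publisher__stats__elapsed_time_in_seconds` as an ordinary variable before the ORM ever sees it. Lookup paths only have meaning as keyword names or inside expression objects such as Django's `F()`. The comparison I am after can be written in plain Python like this; it is a sketch only, with in-memory tuples standing in for rows and a hypothetical function name `recent_articles`, not an ORM query.

```python
from datetime import datetime, timedelta, timezone


def recent_articles(articles, elapsed_seconds, now=None):
    """Return titles of articles created within the last `elapsed_seconds`.

    `articles` is a list of (title, created_on) tuples standing in for rows.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=elapsed_seconds)
    # "Not older than" means created at or after the cutoff (>=, not <=).
    return [title for title, created_on in articles if created_on >= cutoff]
```

Note that the filter shown earlier uses `created_on__lte`, which would select articles older than the cutoff; if "not older than" is the goal, the lookup would presumably be `created_on__gte`. In the ORM the column itself would be referenced with `F('publisher__stats__elapsed_time_in_seconds')`, and how seconds are turned into a duration there depends on the database backend.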

