#StackBounty: #python #json #python-3.x #csv #parsing How to read and map CSV's multi line header rows using python

Bounty: 50

I have a CSV file that was downloaded from a database (it is stored there as CSV), and now I have to parse it into a JSON schema. Don’t worry, the link is just a GitHub gist.


The problem I am facing is that the file has a multi-line header; check the CSV file here.

If you look at the file, you will notice:

  1. The 1st line of the CSV holds the 1st line of headers, and the next
    line has all the values for those headers.

  2. The 3rd line of the CSV holds the 2nd line of headers, and the next
    line has all the values for those headers.

  3. The 5th line of the CSV holds the 3rd line of headers, and the next
    line has all the values for those headers.

You can also notice a pattern here:

  • 1st line of headers has no tab
  • 2nd line of headers has exactly one tab
  • 3rd line of headers has two tabs

This goes for all the records.
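
The pattern above could be turned into code roughly as follows. This is only a minimal sketch, not a tested solution for the actual file: it assumes the tabs are leading tabs, that header lines and value lines strictly alternate, that fields are comma-separated, and that no quoted field spans several lines. The function name `read_leveled_rows` and the sample data are hypothetical.

```python
import csv


def read_leveled_rows(lines):
    """Yield (level, record) pairs from alternating header/value lines.

    Assumption: the number of leading tabs on a header line gives its
    nesting level (0, 1 or 2), and the very next line holds its values.
    """
    # Drop blank lines and line endings so the pairing below stays aligned.
    lines = [ln.rstrip("\r\n") for ln in lines if ln.strip()]
    for header_line, value_line in zip(lines[0::2], lines[1::2]):
        level = len(header_line) - len(header_line.lstrip("\t"))
        # csv.reader handles quoted fields within a single line.
        headers = next(csv.reader([header_line.lstrip("\t")]))
        values = next(csv.reader([value_line.lstrip("\t")]))
        record = dict(zip((h.strip() for h in headers),
                          (v.strip() for v in values)))
        yield level, record


# Hypothetical sample shaped like the description above.
sample = [
    "Order Ref,Order Status\n",
    "1001,Confirmed\n",
    "\tItem ID,Type\n",
    "\t55,Flight\n",
    "\t\tDepart,Arrive\n",
    "\t\t2019-01-01,2019-01-02\n",
]

for level, record in read_leveled_rows(sample):
    print(level, record)
```

Each yielded pair keeps the nesting level next to the record, so a later step can decide where the record belongs in the nested output.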

The 1st problem is this multi-line header.
The 2nd problem is how to parse it into nested JSON matching the schema I have.
One of the solutions I tried was "Create nested JSON from CSV", and that is where I noticed the 1st problem with my CSV.
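
For the nesting step, one possible approach is to treat each record as a child of the most recent record one level above it. The sketch below assumes the rows have already been reduced to `(level, record)` pairs, with level 0 for the first header line, 1 for the second and 2 for the third. The function name `nest_by_level`, the generic `children` key and the sample rows are all hypothetical; the real schema uses more specific keys.

```python
import json


def nest_by_level(leveled_rows, child_key="children"):
    """Fold (level, record) pairs into a list of nested dicts."""
    roots = []
    stack = []  # stack[i] is the most recent node seen at level i
    for level, record in leveled_rows:
        node = dict(record)
        del stack[level:]  # forget nodes at this level or deeper
        if level == 0:
            roots.append(node)
        else:
            if len(stack) < level:
                raise ValueError("level %d row has no parent" % level)
            # Attach to the latest node one level up.
            stack[level - 1].setdefault(child_key, []).append(node)
        stack.append(node)
    return roots


# Hypothetical rows: one order with one item and two deeper records,
# followed by a second order.
rows = [
    (0, {"Order Ref": "1001"}),
    (1, {"Item ID": "55"}),
    (2, {"Origin": "LHR"}),
    (2, {"Origin": "JFK"}),
    (0, {"Order Ref": "1002"}),
]

print(json.dumps(nest_by_level(rows), indent=2))
```

A stack keeps the logic independent of how many levels there are; renaming `children` to the schema's own keys would be a separate mapping step.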

It’s been 2 days that I have been on this, and I still haven’t got any kind of solution.

My code looks like this; I am only trying to parse the initial fields of the schema.

import csv
import json

def csvParse(csvfile):
    data = []
    # Open the CSV; the with-block closes the file when done
    with open(csvfile, 'r', newline='') as f:
        # Change each fieldname to the appropriate field name.
        # Note: "Customer Name" and "Currency" appear twice; DictReader keeps
        # only the last value for a repeated name. With explicit fieldnames,
        # the header line itself is also returned as a data row.
        reader = csv.DictReader(f, fieldnames=(
            "Order Ref", "Order Status", "Affiliate", "Source", "Agent",
            "Customer Name", "Customer Name", "Email Address", "Telephone",
            "Mobile", "Address 1", "Address 2", "City", "County/State",
            "Postal Code", "Country", "Voucher Code", "Voucher Amount",
            "Order Date", "Item ID", "Type", "Supplier Code", "Supplier Name",
            "Booking Ref", "Supplier Price", "Currency", "Selling Price",
            "Currency", "Depart", "Arrive", "Origin", "Destination", "Carrier",
            "Flight No", "Class", "Pax Type", "Title", "Firstname", "Surname",
            "DOB", "Gender", "FOID Type"))

        # data frame names in a list
        for row in reader:
            frame = {"orderRef": row["Order Ref"],
                     "orderStatus": row["Order Status"],
                     "affiliate": row["Affiliate"],
                     "source": row["Source"],
                     "customers": []}
            data.append(frame)
    return data


JSON Schema


orderRef: number,
  orderStatus: string,
  affiliate: string,
  source: string,
  agent: string,
  customer: {
    name: string,
    email: string,
    telephone: string,
    mobile: string,
    address: {
      address1: string,
      address2: string,
      city: string,
      county: string,
      postcode: string,
      country: string
    }
  },
  voucherCode: string,
  voucherAmount: number,
  orderDate: date,
      itemId: number,
      type: string,
      supplierCode: string,
      supplierName: string,
      bookingReference: string,
      supplierPrice: float,
      supplierPriceCurrency: string,
      sellingPrice: float,
      sellingPriceCurrency: string,
      legs: [
        {
          departureDate: datetime,
          arrivalDate: datetime, // can be null if not available
          origin: string,
          destination: string,
          carrier: string,
          flightNumber: string,
          class: string
        }
      ],
      passengers: [
        {
          passengerType: string,
          title: string,
          firstName: string,
          surName: string,
          dob: string,
          gender: string,
          foidType: string
        }
      ]
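
To show how the order-level part of this schema could be filled from one flat row, here is a minimal sketch. It assumes the row is a dict keyed by the field names used in the code above, that "Order Ref" is a whole number and "Voucher Amount" a decimal number, and that empty cells should become `None`. The helper names `to_number` and `build_order` are hypothetical, the `county` key for the "County/State" column is an assumption, and the order date is left as text because its format in the CSV is not known.

```python
def to_number(text, cast):
    """Convert a CSV cell with cast (int or float); empty cells give None."""
    text = (text or "").strip()
    return cast(text) if text else None


def build_order(row):
    """Map one flat CSV row (a dict) onto the order-level schema fields."""
    return {
        "orderRef": to_number(row.get("Order Ref"), int),
        "orderStatus": row.get("Order Status"),
        "affiliate": row.get("Affiliate"),
        "source": row.get("Source"),
        "agent": row.get("Agent"),
        "customer": {
            "name": row.get("Customer Name"),
            "email": row.get("Email Address"),
            "telephone": row.get("Telephone"),
            "mobile": row.get("Mobile"),
            "address": {
                "address1": row.get("Address 1"),
                "address2": row.get("Address 2"),
                "city": row.get("City"),
                "county": row.get("County/State"),  # assumed key name
                "postcode": row.get("Postal Code"),
                "country": row.get("Country"),
            },
        },
        "voucherCode": row.get("Voucher Code"),
        "voucherAmount": to_number(row.get("Voucher Amount"), float),
        "orderDate": row.get("Order Date"),  # kept as text, format unknown
    }


# Hypothetical row with made-up values, only to show the shape.
example_row = {
    "Order Ref": "1001",
    "Order Status": "Confirmed",
    "Customer Name": "A. Example",
    "City": "London",
    "Voucher Amount": "",
}
order = build_order(example_row)
print(order["orderRef"], order["customer"]["address"]["city"])
```

Using `row.get` rather than `row[...]` means a missing column yields `None` instead of raising `KeyError`, which may or may not be the behaviour you want.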

