#StackBounty: #python #pandas #aws-lambda AWS Lambda – read csv and convert to pandas dataframe

Bounty: 100

I have a simple Lambda function that reads a csv file from an S3 bucket. That part works fine, but when I try to load the csv data into a pandas data frame I get the error "string indices must be integers".

My code is bog-standard, I just need to use the csv as a data frame for further manipulation. The hashed line is the source of the error. I can print the data with no problems, so the bucket and file details are configured properly.

Updated code:

import json
import pandas as pd
import numpy as np
import requests
import glob
import time
import os
from datetime import datetime
from csv import reader
import boto3
import traceback
import io

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    try:
            
        bucket_name = event["Records"][0]["s3"]["bucket"]["name"]
        s3_file_name = event["Records"][0]["s3"]["object"]["key"]
        resp = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
        
        data = resp['Body'].read().decode('utf-8')
        df = pd.DataFrame(list(reader(data)))
        print (df.head())

    except Exception as err:
        print(err)
        

        
        
    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
    
    traceback.print_exc()
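
As a point of reference, a common way to get an S3 object body into pandas is to wrap the decoded string in io.StringIO and hand it to pandas.read_csv. The snippet below is only a sketch of that approach, not the poster's code; it reuses the bucket/key handling from the handler above and assumes the csv has a header row:

import io
import boto3
import pandas as pd

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    bucket_name = event["Records"][0]["s3"]["bucket"]["name"]
    s3_file_name = event["Records"][0]["s3"]["object"]["key"]
    resp = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)

    # Body.read() returns bytes; decode to text and wrap it in a file-like object
    data = resp['Body'].read().decode('utf-8')
    df = pd.read_csv(io.StringIO(data))
    print(df.head())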


Get this bounty!!!

#StackBounty: #python #aws-lambda #amazon-dynamodb #boto3 #serverless How to create the dynamodb table using serverless.yml and delete …

Bounty: 50

I’ve created the dynamodb table using serverless.yml as below:

resources:
  Resources:
    myTable:
      Type: AWS::DynamoDB::Table
      DeletionPolicy: Retain
      Properties:
        TableName: myTable
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
          - AttributeName: firstname
            AttributeType: S
          - AttributeName: lastname
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
          - AttributeName: firstname
            KeyType: RANGE
        BillingMode: PAY_PER_REQUEST
        SSESpecification:
          SSEEnabled: true

But I’ve got this issue:

An error occurred: myTable – One or more parameter values were
invalid: Number of attributes in KeySchema does not exactly match
number of attributes defined in AttributeDefinitions (Service:
AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException;
Request ID: PEI9OT7E72HQN4N5MQUOIUQ18JVV4KQNSO5AEMVJF66Q9ASUAAJG;
Proxy: null).

Could you help me create the DynamoDB table using serverless.yml?
And how can I delete the items whose first name is "First" from this table using Python boto3?
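
For context, DynamoDB requires every attribute listed under AttributeDefinitions to be used either in the KeySchema or in a secondary index, which is why defining lastname without referencing it anywhere produces the ValidationException above; dropping lastname from AttributeDefinitions (or adding an index on it) resolves that mismatch. For the deletion part, the following is only a rough boto3 sketch, assuming the table keeps the myTable name and the id/firstname key schema from the template, and ignoring scan pagination for brevity:

import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('myTable')

# Scan for items whose firstname is "First"; the filter is applied after the
# read, so this still walks the whole table
resp = table.scan(FilterExpression=Attr('firstname').eq('First'))

with table.batch_writer() as batch:
    for item in resp['Items']:
        # delete_item needs the full primary key: partition key plus sort key
        batch.delete_item(Key={'id': item['id'], 'firstname': item['firstname']})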


Get this bounty!!!

#StackBounty: #javascript #json #google-apps-script #google-sheets #aws-lambda How to batch row data and send a single JSON payload?

Bounty: 100

I currently use a Google Apps Script on a Google Sheet that sends individual row data to AWS API Gateway to generate a screenshot. At the moment, multiple single JSON payload requests are causing some Lambda function failures. So I want to batch the row data and send it as a single payload, so a single AWS Lambda function can then perform and complete multiple screenshots.

How can I batch the JSON payload after iterating over each row in the code below?

function S3payload () {
  var PAYLOAD_SENT = "S3 SCREENSHOT DATA SENT";
  
  var sheet = SpreadsheetApp.getActiveSheet(); // Use data from the active sheet
  
  // Add temporary column header for Payload Status new column entries
  sheet.getRange('E1').activate();
  sheet.getCurrentCell().setValue('payload status');
  
  var startRow = 2;                            // First row of data to process
  var numRows = sheet.getLastRow() - 1;        // Number of rows to process
  var lastColumn = sheet.getLastColumn();      // Last column
  var dataRange = sheet.getRange(startRow, 1, numRows, lastColumn) // Fetch the data range of the active sheet
  var data = dataRange.getValues();            // Fetch values for each row in the range
  
  // Work through each row in the spreadsheet
  for (var i = 0; i < data.length; ++i) {
    var row = data[i];  
    // Assign each row a variable   
    var index = row[0];     // Col A: Index Sequence Number
    var img = row[1];   // Col B: Image Row
    var url = row[2];      // Col C: URL Row
    var payloadStatus = row[lastColumn - 1];  // Col E: Payload Status (has the payload been sent)
  
    var siteOwner = "email@example.com";
    
    // Prevent from sending payload duplicates
    if (payloadStatus !== PAYLOAD_SENT) {  
        
      /* Forward the Contact Form submission to the owner of the site
      var emailAddress = siteOwner; 
      var subject = "New contact form submission: " + name;
      var message = message;*/
      
      //Send payload body to AWS API GATEWAY
      //var sheetid = SpreadsheetApp.getActiveSpreadsheet().getId(); // get the actual id
      //var companyname = SpreadsheetApp.getActiveSpreadsheet().getName(); // get the name of the sheet (companyname)
      
      var payload = {
        "img": img,
        "url": url
      };
      
      var url = 'https://hfcrequestbin.herokuapp.com/vbxpsavb';
      var options = {
        'method': 'post',
        'payload': JSON.stringify(payload)
      };
      
      var response = UrlFetchApp.fetch(url,options);
      
      sheet.getRange(startRow + i, lastColumn).setValue(PAYLOAD_SENT); // Update the last column with "PAYLOAD_SENT"
      SpreadsheetApp.flush(); // Make sure the last cell is updated right away
      
      // Remove temporary column header for Payload Status    
      sheet.getRange('E1').activate();
      sheet.getCurrentCell().clear({contentsOnly: true, skipFilteredRows: true});
      
    }
  }
}

Example individual JSON payload

{"img":"https://s3screenshotbucket.s3.amazonaws.com/realitymine.com.png","url":"https://realitymine.com"}


Example desired output result

[
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/gavurin.com.png","url":"https://gavurin.com"},
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/google.com.png","url":"https://google.com"},
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/amazon.com","url":"https://www.amazon.com"},  
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/stackoverflow.com","url":"https://stackoverflow.com"},
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/duckduckgo.com","url":"https://duckduckgo.com"},
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/docs.aws.amazon.com","url":"https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-features.html"},  
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/github.com","url":"https://github.com"},  
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/github.com/shelfio/chrome-aws-lambda-layer","url":"https://github.com/shelfio/chrome-aws-lambda-layer"},  
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/gwww.youtube.com","url":"https://www.youtube.com"},   
    {"img":"https://s3screenshotbucket-useast1v5.s3.amazonaws.com/w3docs.com","url":"https://www.w3docs.com"}       
]
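
One way to produce a batched payload like the one above is to push each row's object into an array inside the loop and make a single UrlFetchApp.fetch call once the loop finishes. The following is only a rough sketch of that idea, reusing the row layout and request-bin URL from the script above; the payload-status bookkeeping is omitted:

var payloads = [];

for (var i = 0; i < data.length; ++i) {
  var row = data[i];
  // Collect each row's screenshot data instead of sending it straight away
  payloads.push({ img: row[1], url: row[2] });
}

// Send the whole batch as a single JSON array
var options = {
  'method': 'post',
  'contentType': 'application/json',
  'payload': JSON.stringify(payloads)
};
UrlFetchApp.fetch('https://hfcrequestbin.herokuapp.com/vbxpsavb', options);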


Get this bounty!!!

#StackBounty: #ffmpeg #aws-lambda Am I missing a timeout param in FFMPEG?

Bounty: 100

I’m running an ffmpeg command like this:

ffmpeg -loglevel quiet -report -timelimit 15 -timeout 10 -protocol_whitelist file,http,https,tcp,tls,crypto -i ${inputFile} -vframes 1 ${outputFile} -y

This is running in an AWS Lambda function. My Lambda timeout is at 30 seconds. For some reason I am getting “Task timed out” messages still. I should note I log before and after the command, so I know it’s timing out during this task.

Update

In terms of the entire lambda execution I do the following:

  • Invoke a lambda to get an access token. This lambda makes one API request. It has a timeout of 5 seconds. The max time was 660 ms for one request.

  • Make another API request to verify data. The max time was 1.6 seconds.

  • Run FFMPEG

timelimit is supposed to "Exit after ffmpeg has been running for duration seconds in CPU user time." Theoretically this shouldn't run for more than 15 seconds then, plus maybe 2-3 more for the other requests.

timeout is probably superfluous here. There were a lot of definitions for it in the manual, but I think it mainly applies to waiting on input? Either way, I'd think timelimit would cover my bases.

Update 2

I checked my debug log and saw this:

Reading option '-timelimit' ... matched as option 'timelimit' (set max runtime in seconds) with argument '15'.
Reading option '-timeout' ... matched as AVOption 'timeout' with argument '10'.

It seems both options are supported by my build.

Update 3

I have updated my code with a lot of logs. I definitively see the FFMPEG command as the last thing that executes before stalling out for the 30-second timeout.

Update 4

I can reproduce the behavior by pointing at a track instead of the full manifest. I have set the command to this:

ffmpeg -loglevel debug -timelimit 5 -timeout 5 -i 'https://streamprod-eastus-streamprodeastus-usea.streaming.media.azure.net/0c495135-95fa-48ec-a258-4ba40262e1be/23ab167b-9fec-439e-b447-d355ff5705df.ism/QualityLevels(200000)/Manifest(video,format=m3u8-aapl)' -vframes 1 temp.jpg -y

A few things here:

  1. I typically point at the actual manifest (not the track), and things usually run much faster
  2. I have lowered the timelimit and timeout to 5. Despite this, when I run a timer, the command runs for ~15 seconds every time. It outputs a bunch of errors, likely due to this being a track rather than the full manifest, and then spits out the desired image.

The full output is at https://gist.github.com/DaveStein/b3803f925d64dd96cd45ae9db5e5a4d0


Get this bounty!!!

#StackBounty: #redirect #amazon-s3 #aws-lambda #amazon-cloudfront #aws-lambda-edge Cloudfront, lambda @ edge, S3 redirect objects,

Bounty: 50

I am building an S3 URL redirect, nothing special, just a bunch of zero-length objects with the WebsiteRedirectLocation metadata filled out. The S3 bucket is set to serve static websites, the bucket policy is set to public, etc. It works just fine.

HOWEVER – I also want to lock down certain files in the bucket – specifically some html files that serve to manage the redirects (like adding new redirects). With the traditional set up, I can both use the redirects, and also serve the html page just fine. But in order to lock it down, I need to use Cloudfront and Lambda@edge like in these posts:

https://douglasduhaime.com/posts/s3-lambda-auth.html

http://kynatro.com/blog/2018/01/03/a-step-by-step-guide-to-creating-a-password-protected-s3-bucket/

I have modified the lambda@edge script to only prompt for a password IF the admin page (or its assets like CSS/JS) are requested. If the requested path is something else (presumably a redirect file) the user is not prompted for a password. And yes, I could also set a behavior rule in Cloudfront to decide when to use the Lambda function to prompt for a password.

And it kind of works. When I follow these instructions and visit my site via the Cloudfront URL, I do indeed get prompted for a password when I go to the root of my site – the admin page. However, the redirects will not work. If I try to load a redirect, the browser just downloads it instead.

Now, in another post someone suggested that I change my Cloudfront distribution endpoint to the S3 bucket WEBSITE endpoint – which I think also means changing the bucket policy back to website mode and public, which sucks because now it's accessible outside of the Cloudfront policy, which I do not want. Additionally, Cloudfront no longer automatically serves the specified index file, which isn't the worst thing.

SO – is it possible to lock down my bucket, serve it entirely through Cloudfront with Lambda@edge, BUT also have Cloudfront respect those redirects instead of just prompting a download? Is there a setting in Cloudfront to respect the headers? Should I set up different behavior rules for the different files (html vs redirects)?


Get this bounty!!!

#StackBounty: #man-in-the-middle #request-signing #aws-cognito #aws-lambda #mobile-app What is the use case of request signing in this …

Bounty: 150

The API of a mobile app I was testing sends the AWS AccessKeyId and SecretKey used for request signing from the AWS Cognito server unencrypted (apart from the regular TLS encryption), making it possible to re-sign all requests to their AWS Lambda API, e.g. using Burp's "AWS Signer" extension.

With this, a Man-In-The-Middle could sign all altered requests, so I wonder what the actual use case of request signing is, in this instance?

Shouldn’t the AccessKeyID and SecretKey be kept secret?

The owner of the app is telling me that this is not an issue because they are following the AWS guidelines.

Is that correct? Or are they doing something wrong?

Why would they sign the requests in the first place in their mobile app?
What is the use case of signing the requests, when the 'secrets' for creating a signature are distributed over the same connection in the clear (except for TLS)?

Does this conform to best practices when using AWS Lambda for serverless mobile app APIs? Is request signing even useful in this instance? Most apps I have tested didn't use request signing.


Get this bounty!!!

#StackBounty: #python-3.x #amazon-s3 #aws-lambda Extract specific column from csv file, and convert to string in Lambda using Python

Bounty: 50

I'm trying to get all email addresses, in a comma-separated format, from a specific column. They come from a csv temp file in Lambda. My goal is to save that file to S3 with only one column containing the email addresses.

This is what the source data looks like:

[screenshot]

Here is my code:

#open file and extract email address
with open('/tmp/maillist.csv', 'w') as mail_file:
    wm = csv.writer(mail_file)
    mail_list = csv.reader(open('/tmp/filtered.csv', "r"))
    for rows in mail_list:
        ','.join(rows)
        wm.writerow(rows[3])
bucket.upload_file('/tmp/maillist.csv', key)

I was hoping to get a result like this:

[screenshot]

But instead, I’m getting a result like this:

[screenshot]

I also tried this code:

#open file and extract email address
mail_list = csv.reader(open('/tmp/filtered.csv', "r"))
with open('/tmp/maillist.csv', 'w') as mail_file:
    wm = csv.writer(mail_file)
    wm.writerow(mail_list[3])
bucket.upload_file('/tmp/maillist.csv', key)

But I get this error instead:

Response:
{
  "errorMessage": "'_csv.reader' object is not subscriptable",
  "errorType": "TypeError",
  "stackTrace": [
    "  File "/var/task/lambda_function.py", line 68, in lambda_handlern     wm.writerow(mail_list[3])n"

Any help is appreciated.
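
For what it's worth, csv.writer.writerow expects a sequence of fields, so handing it a plain string like rows[3] writes each character as its own column (and the ','.join(rows) result is discarded, since it is never assigned). Below is a minimal sketch of writing the column value as a single field, assuming the addresses really do sit at index 3 as in the code above, with bucket and key defined as in the original snippet:

import csv

with open('/tmp/filtered.csv', newline='') as src, \
        open('/tmp/maillist.csv', 'w', newline='') as dst:
    writer = csv.writer(dst)
    for row in csv.reader(src):
        # wrap the single value in a list so it stays one field per row
        writer.writerow([row[3]])

bucket.upload_file('/tmp/maillist.csv', key)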


Get this bounty!!!

#StackBounty: #amazon-web-services #aws-lambda #amazon-cloudformation #aws-api-gateway AWS API Gateway deployed API can't parse req…

Bounty: 50

I have a Lambda function integrated with API Gateway, and the stack was deployed as a CloudFormation template. When I test the endpoint in the AWS web console I get a correct response, but when I invoke the deployed version of the API I get this error:

"message": "Could not parse request body into json: Unrecognized token ....etc"

I tried this mapping { "body" : $input.json('$') } in the integration request, but it didn't work.

Here is the JSON I am trying to send using POSTMAN

{
    "description": "test description",
    "status": "test status"
}

and the request has the header Content-Type: application/json.

Here are screenshots of the Postman request body & headers, and the response from the API:

[screenshots]

Any solution, guys?

UPDATE:

I put a mapping template at integration request level as the following:

{
   "body-json" : $input.json('$')
}

And updated the lambda function to log the coming request, then made 2 requests:

First one: from API Gateway test web console:

I found the following in the cloudwatch logs:

INFO    {
  body: {
    description: 'test',
    projectId: 23,
    action: 'test',
    entity: 'test',
    startDate: '01-01-2020',
    endDate: '01-01-2020'
  }
}

Second one: from POSTMAN:

I found the following in the cloudwatch logs:

INFO    {
  body: 'ewogICAgImRlc2NyaXB0aW9uIjogInRlc3QiLAogICAgInByb2plY3RJZCI6IDIzLAogICAgImFjdGlvbiI6ICJ0ZXN0IiwKICAgICJlbnRpdHkiOiAidGVzdCIsCiAgICAic3RhcnREYXRlIjogIjAxLTAxLTIwMjAiLAogICAgImVuZERhdGUiOiAiMDEtMDEtMjAyMCIKfQ=='
}

That indicates that when the request is made through Postman, the JSON payload arrives base64-encoded rather than as parsed JSON. What can cause such a thing, and how do I deal with it?
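
For reference, the 'ewogICAg…' value logged above is a base64 encoding of the posted JSON, so on this path the raw body reaches the function still encoded. Below is only a hedged sketch of a guard in the Node handler that copes with the body arriving as an object, a JSON string, or a base64 string; the parseBody helper name is made up for illustration:

// Assumption: the payload ends up in event.body in one of three shapes,
// depending on how the request reached the function.
function parseBody(event) {
  const body = event.body;
  if (typeof body !== 'string') {
    return body; // already an object (e.g. via the mapping template)
  }
  try {
    return JSON.parse(body); // plain JSON string
  } catch (e) {
    // fall back to treating it as base64-encoded JSON
    return JSON.parse(Buffer.from(body, 'base64').toString('utf-8'));
  }
}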


Get this bounty!!!

#StackBounty: #amazon-web-services #aws-lambda #grpc #grpc-node How to call a grpc service running on ec2 from aws lambda

Bounty: 50

I have a grpc service, written in Python, deployed on an EC2 instance. I have written a Node.js application, and using that application I am able to call the grpc service from my local machine and also from another EC2 instance. However, when I deploy the same application to Lambda (using serverless deployment), it is not able to call the grpc service.
I have added the line below to the scripts section of package.json so the code can be deployed to Lambda properly.

"postinstall": "npm rebuild grpc --target=12.x --target_arch=x64 --target_platform=linux --target_libc=glibc"

Initially the Lambda executed without any error; it just was not calling the grpc service.
After I added the VPC configuration to my serverless.yml file, it started returning Internal Server Error and logging the error "EACCES: permission denied, open '/var/task/handler.js'" in CloudWatch.

What could be wrong here?

Here is the serverless.yml file:

service: myservice

provider:
  name: aws
  runtime: nodejs12.x
  vpc:
    securityGroupIds:
      - securityGroupid1
    subnetIds:
      - subnetId1
    stage: dev
  region: us-east-1
  stackTags:
    owner: me

functions:
  sendMessage:
    handler: handler.sendMessage
    events:
      - http:
          path: sendMessage
          method: post

Here is the lambda function code:

'use strict';
const AWS = require('aws-sdk');
const grpcClient = require('./grpcClient');

module.exports.sendMessage = async (event, context) => {
  const timestamp = new Date().getTime();
  console.log(event, timestamp);
  console.log(event.body);

  const message = event.body;

  let reply = 'Go Serverless v1.0! Your function executed successfully!';


     grpcClient.utterQuery({
       query: message,
       user_id: 10101,
       session_id: 321
     }, (error, riaReply) => {
         if (error) {
           console.error(error)
         } else {
           console.log('successfully queried grpc service.')
           console.log(riaReply.response)
         }
     });

  return {
    statusCode: 200,
    body: reply
  };
};
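
One thing worth noting about the handler above: it is async but returns immediately after kicking off the callback-style utterQuery call, so the Lambda runtime can freeze the invocation before the grpc response ever arrives. Below is only a hedged sketch of wrapping that call in a Promise and awaiting it, keeping the grpcClient and field names from the code above:

module.exports.sendMessage = async (event, context) => {
  const message = event.body;

  // Wrap the callback-style client call so the handler actually waits for the reply
  const riaReply = await new Promise((resolve, reject) => {
    grpcClient.utterQuery(
      { query: message, user_id: 10101, session_id: 321 },
      (error, reply) => (error ? reject(error) : resolve(reply))
    );
  });

  return {
    statusCode: 200,
    body: JSON.stringify(riaReply.response)
  };
};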


Get this bounty!!!