Extracting numbers from a JSONL file

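I am trying to extract integers from a JSON Lines file (specifically the ratings, e.g. "star_rating": "5") so that I can average them. Here is a sample line from the file: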

{"marketplace":"US","customer_id":"23526356","review_id":"R273DCA6Y0H9V7","product_id":"B00TCO0ZAA","product_parent":"292641483","product_title":"Professional 58mm Center Pinch Lens Cap for CANON 18-55mm , 55-250mm , 75-300mm , 50mm 1.4 , 85mm 1.8 , T5I , 70D , 60D , 7D , 7DII","product_category":"Camera","star_rating":"5","helpful_votes":"0","total_votes":"0","vine":"N","verified_purchase":"Y","review_headline":"Love it!!!","review_body":"Perfect, even sturdier than the original!","review_date":"2015-08-31"}

Here is my code:

import boto3
import json

def lambda_handler(event, context):
    PRODUCT = event['product_id']
    S3_BUCKET = 'test'
    S3_FILE = 'tester/lines.jsonl'

    s3 = boto3.client('s3')

    # Select only the star_rating field of the rows matching the product.
    r = s3.select_object_content(
        Bucket=S3_BUCKET,
        Key=S3_FILE,
        ExpressionType='SQL',
        Expression="select s.star_rating from s3object[*] s where s.product_id = '" + PRODUCT + "'",
        InputSerialization={'JSON': {"Type": "Lines"}},
        OutputSerialization={'JSON': {}}
    )

    # The payload is an event stream; Records events carry the matching rows as bytes.
    # The loop variable is named chunk so it does not shadow the handler's event argument.
    for chunk in r['Payload']:
        if 'Records' in chunk:
            records = chunk['Records']['Payload'].decode('utf-8')
            #n_records = [int(s) for s in records.split() if s.isdigit()]
            #print (n_records)
            return {
                'StatusCode': 200,
                'product_id_str': PRODUCT,
                'Body': (records),
                #'Crap': (n_records)
            }

I tried to copy the output into a list (the commented-out n_records lines), but the list comes out empty.

Essentially, if the returned output is:

"Body": "{\"star_rating\":\"3\"}\n{\"star_rating\":\"4\"}\n{\"star_rating\":\"4\"}\n{\"star_rating\":\"2\"}\n"

then I want to compute the average of those integers.
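
For illustration, here is a minimal sketch of one way to turn a Body string like the one above into integers and average them. It assumes every non-empty line is a JSON object whose "star_rating" value is a digit string; the function name average_star_rating is hypothetical and not part of boto3. It also shows why the commented-out attempt yields an empty list: records.split() produces whole tokens such as {"star_rating":"3"}, and isdigit() is False for each of them.

```python
import json


def average_star_rating(body: str) -> float:
    """Average the star_rating values in a JSON Lines string (hypothetical helper)."""
    ratings = [
        int(json.loads(line)["star_rating"])  # parse each line, then convert "3" -> 3
        for line in body.splitlines()
        if line.strip()                       # skip blank lines, e.g. after the final "\n"
    ]
    if not ratings:
        raise ValueError("no star_rating records found")
    return sum(ratings) / len(ratings)


# The sample Body from the question.
body = '{"star_rating":"3"}\n{"star_rating":"4"}\n{"star_rating":"4"}\n{"star_rating":"2"}\n'

# The original attempt: no whitespace-separated token is purely digits, so this is empty.
tokens_that_are_digits = [int(s) for s in body.split() if s.isdigit()]

print(tokens_that_are_digits)     # []
print(average_star_rating(body))  # 3.25
```

Inside the Lambda handler, the same helper could be applied to the decoded records string before building the return value. If the result set is large, the payload may arrive as several Records events, so it may be safer to concatenate all of them before parsing.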
