Tips for Merging All JSON Files in a Directory with Python (2022)
Hoàng Quốc Trung was searching for the keyword "Merge all json files in directory python". Updated: 2022-09-30 15:12:09. This article shares the tips in detail. If anything is still unclear after reading, leave a comment at the end of the post and I will explain and walk through it again.

Suppose there are 3 files: data1.json, data2.json, data3.json.
Let's say data1.json contains:

    {
        "players": [
            {"name": "Alexis Sanchez", "club": "Manchester United"},
            {"name": "Robin van Persie", "club": "Feyenoord"}
        ]
    }

data2.json contains:

    {
        "players": [
            {"name": "Nicolas Pepe", "club": "Arsenal"}
        ]
    }

data3.json contains:

    {
        "players": [
            {"name": "Gonzalo Higuain", "club": "Napoli"},
            {"name": "Sunil Chettri", "club": "Bengaluru FC"}
        ]
    }

A merge of these 3 files should generate a file with the following data. result.json:

    {
        "players": [
            {"name": "Alexis Sanchez", "club": "Manchester United"},
            {"name": "Robin van Persie", "club": "Feyenoord"},
            {"name": "Nicolas Pepe", "club": "Arsenal"},
            {"name": "Gonzalo Higuain", "club": "Napoli"},
            {"name": "Sunil Chettri", "club": "Bengaluru FC"}
        ]
    }

How can I open multiple JSON files from a directory and merge them into a single JSON file in Python?
My approach:

    import os
    import json

    import pandas as pd

    # Placeholder: replace with the path for all the files.
    path_to_json = "."
    json_files = [pos_json for pos_json in os.listdir(path_to_json)
                  if pos_json.endswith('.json')]

    rows = []
    for js in json_files:
        with open(os.path.join(path_to_json, js)) as json_file:
            json_text = json.load(json_file)
        # The sample files use the key 'players' (the original snippet read
        # 'strikers' and kept only the first entry of each file).
        for player in json_text['players']:
            rows.append([player['name'], player['club']])

    jsons_data = pd.DataFrame(rows, columns=['name', 'club'])
    print(jsons_data)
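If the goal is only a merged result.json, the DataFrame step can be skipped. The following is a minimal sketch under stated assumptions, not the original poster's method: it assumes every file holds one JSON object with a list under the key "players" (matched case-insensitively as a precaution), and the function name merge_players and its parameters are hypothetical.

```python
import json
from pathlib import Path


def merge_players(directory, output_name="result.json", key="players"):
    """Concatenate the list stored under `key` in every *.json file of `directory`."""
    directory = Path(directory)
    output_path = directory / output_name
    merged = []
    # sorted() keeps the merge order deterministic (data1, data2, data3, ...)
    for path in sorted(directory.glob("*.json")):
        if path == output_path:
            continue  # do not re-read a result file left over from an earlier run
        with path.open(encoding="utf-8") as infile:
            data = json.load(infile)
        # Match the key case-insensitively, so "Players" and "players" both work
        for name, value in data.items():
            if name.lower() == key.lower():
                merged.extend(value)
    with output_path.open("w", encoding="utf-8") as outfile:
        json.dump({key: merged}, outfile, indent=4)
    return merged
```

Calling merge_players("path/to/json/files") would write result.json into that directory and return the combined list. Skipping the output file matters because a second run would otherwise merge the previous result back in.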
    # You have file1.json and file2.json.
    # Each file has this structure:
    #   [{"key1": "value1"}]  (in file1)
    #   [{"key2": "value2"}]  (in file2)
    # The goal is to merge them into:
    #   [{"key1": "value1"},
    #    {"key2": "value2"}]
    import json
    import glob

    result = []
    for f in glob.glob("*.json"):
        if f == "merged_file.json":
            continue  # skip the output of a previous run
        with open(f, "r") as infile:
            # extend, not append: each file is itself a list, and append
            # would produce nested lists instead of the flat list above
            result.extend(json.load(infile))

    # json.dump writes str, so the file must be opened in text mode ("w");
    # "wb" raises a TypeError in Python 3
    with open("merged_file.json", "w") as outfile:
        json.dump(result, outfile, indent=4)

Hi, I would like to point out that all feedback is more than welcome. I created this solution because I couldn't find one anywhere. This small script reads a list of files selected by file name. For example:
Let’s say that you have something like this:
    09/03/2022  07:01 p. m.    173 file-20180309T200145.json
    09/03/2022  11:01 p. m.    173 file-20180310T000129.json
    10/03/2022  03:01 a. m.    173 file-20180310T040117.json
    10/03/2022  07:01 a. m.    173 file-20180310T080111.json
    10/03/2022  11:01 a. m.    173 file-20180310T120127.json
    11/03/2022  03:01 p. m.    173 file-20180311T160118.json

You need to consolidate them as quickly as possible prior to ingestion. This is where the script comes in handy: it also adds the filename to the data, so every row can be traced back to your raw data.
Assumptions:
- The Python script must be in the same directory as the JSON files.
- The Python script and any other files within the same directory MUST have names different from the files to be merged.

The code:
    import json

    import pandas as pd
    from findtools.find_files import find_files, Match

    File_prefix = 'file-*'
    dfs = []

    # Recursively find all files named file-* under the current directory
    json_files_pattern = Match(filetype="f", name=File_prefix)
    found_files = find_files(path=".", match=json_files_pattern)

    for found_file in found_files:
        with open(found_file) as f:
            data = json.load(f)
        # Put the JSON into a pandas DataFrame in table form
        df = pd.DataFrame.from_dict(data, orient="columns")
        # Add the filename column; assigning the plain string fills every row
        # (pd.Series(found_file) would fill only the first row)
        df['filename'] = found_file
        dfs.append(df)  # append the data frame to the list
        print("Adding... " + found_file)

    # Combine the per-file data frames into one
    temp = pd.concat(dfs, ignore_index=True)
    print("Saving .... " + File_prefix + " File")
    # "File_prefix" is a quoted literal in the original script, so the output
    # file is named dataFile_prefix.csv (the variable's "*" is not valid in a
    # Windows file name)
    temp.to_csv("data" + "File_prefix" + ".csv")

Please share your feedback, and thank you.
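The same consolidation can also be sketched with the standard library only, for environments where findtools or pandas is not installed. This is an illustrative variant rather than the author's script: it assumes each file holds either one flat JSON object or a list of flat JSON objects, it searches a single directory instead of recursing, and the names consolidate, prefix and output_csv are hypothetical.

```python
import csv
import glob
import json
import os


def consolidate(directory=".", prefix="file-", output_csv="consolidated.csv"):
    """Merge records from <prefix>*.json files into one CSV with a 'filename' column."""
    rows = []
    pattern = os.path.join(directory, prefix + "*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as infile:
            data = json.load(infile)
        # Accept a single object or a list of objects
        records = data if isinstance(data, list) else [data]
        for record in records:
            row = dict(record)
            # Keep the source file name so each row can be traced back
            row["filename"] = os.path.basename(path)
            rows.append(row)

    # Collect every column name in first-seen order
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)

    out_path = os.path.join(directory, output_csv)
    with open(out_path, "w", newline="", encoding="utf-8") as outfile:
        writer = csv.DictWriter(outfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)  # keys missing from a row become empty cells
    return rows
```

Because the pattern ends in ".json" and the output is a CSV, a rerun does not pick up its own output. Sorting the paths also processes the timestamped file names in chronological order.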