我有 2 个 CSV 文件:“数据”和“映射”:
Device_Name
、GDN
、Device_Type
和 Device_OS
。所有四列均已填充。Device_Name
列已填充,其他三列为空白。 Device_Name
映射其 GDN
、Device_Type
和 Device_OS
映射文件中的值。我知道在只有 2 列时如何使用 dict(需要映射 1 列),但是当需要映射 3 列时我不知道如何完成此操作。
以下是我尝试完成 Device_Type
映射的代码:
x = dict([])
with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
file_map = csv.reader(in_file1, delimiter=',')
for row in file_map:
typemap = [row[0],row[2]]
x.append(typemap)
with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
writer = csv.writer(out_file, delimiter=',')
for row in csv.reader(in_file2, delimiter=','):
try:
row[27] = x[row[11]]
except KeyError:
row[27] = ""
writer.writerow(row)
返回属性错误
。
经过一番研究,我认为我需要创建一个嵌套字典,但我不知道如何做到这一点。
最佳答案
嵌套字典是字典中的字典。很简单的事情。
>>> d = {}
>>> d['dict1'] = {}
>>> d['dict1']['innerkey'] = 'value'
>>> d['dict1']['innerkey2'] = 'value2'
>>> d
{'dict1': {'innerkey': 'value', 'innerkey2': 'value2'}}
您也可以使用 defaultdict
来自 collections
包以方便创建嵌套字典。
>>> import collections
>>> d = collections.defaultdict(dict)
>>> d['dict1']['innerkey'] = 'value'
>>> d # currently a defaultdict type
defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
>>> dict(d) # but is exactly like a normal dictionary.
{'dict1': {'innerkey': 'value'}}
您可以随意填充。
我会在你的代码中推荐一些类似的东西:
d = {} # can use defaultdict(dict) instead
for row in file_map:
# derive row key from something
# when using defaultdict, we can skip the next step creating a dictionary on row_key
d[row_key] = {}
for idx, col in enumerate(row):
d[row_key][idx] = col
根据您的comment :
may be above code is confusing the question. My problem in nutshell: I have 2 files a.csv b.csv, a.csv has 4 columns i j k l, b.csv also has these columns. i is kind of key columns for these csvs'. j k l column is empty in a.csv but populated in b.csv. I want to map values of j k l columns using 'i` as key column from b.csv to a.csv file
我的建议会是这样的(不使用默认字典):
a_file = "path/to/a.csv"
b_file = "path/to/b.csv"
# read from file a.csv
with open(a_file) as f:
# skip headers
f.next()
# get first colum as keys
keys = (line.split(',')[0] for line in f)
# create empty dictionary:
d = {}
# read from file b.csv
with open(b_file) as f:
# gather headers except first key header
headers = f.next().split(',')[1:]
# iterate lines
for line in f:
# gather the colums
cols = line.strip().split(',')
# check to make sure this key should be mapped.
if cols[0] not in keys:
continue
# add key to dict
d[cols[0]] = dict(
# inner keys are the header names, values are columns
(headers[idx], v) for idx, v in enumerate(cols[1:]))
但请注意,用于解析 csv 文件有一个 csv module .
https://stackoverflow.com/questions/16333296/