How to build a nested dictionary data structure for time-series-like data in Python?

November 2018

I am trying to create a nested dictionary or similar structure from the following data:

2014-08-19 23 positive
2014-08-19 23 neutral
2014-08-19 23 positive
2014-08-19 23 bot
2014-08-19 23 positive
2014-08-19 23 positive
2014-08-19 23 bot
2014-08-19 23 positive
2014-08-19 24 positive
2014-08-19 24 positive
2014-08-19 24 bot
2014-08-19 24 positive
2014-08-20 07 positive
2014-08-20 07 positive
2014-08-20 07 positive
2014-08-20 07 bot
2014-08-20 07 positive
2014-08-20 07 neutral
2014-08-20 08 neutral
2014-08-20 08 positive
2014-08-20 08 bot
2014-08-20 08 positive
2014-08-20 08 positive
2014-08-20 08 positive
2014-08-20 08 bot
2014-08-20 08 positive

Ideally I would like the output to be something similar to the following:

2014-08-19:{
            23:{
                positive:5,neutral:1,bot:1}
            24:{
                positive:3, neutral:0,bot:1}}
2014-08-20: {
            07:{
                positive:4,neutral:1,bot:1}
            08:{
                positive:5, neutral:1,bot:2}}

and so on. Here is what I have so far:

collect_tweet={}

for line in open('time_short.txt'):
    line=line.strip().split(' ')

    if line[0] not in collect_tweet:
        collect_tweet[line[0]]= {}
        if line[1] not in collect_tweet[line[0]]:
            collect_tweet[line[0]][line[1]]=[]

    collect_tweet[line[0]][line[1]].append(line[2])

Any ideas or suggestions to accomplish this?

1 answer

You're really close; this should do what you want:

collect_tweet = {}

with open('time_short.txt') as f:
    for line in f:
        vals = line.split()
        if len(vals) < 3:
            continue  # skip blank or malformed lines
        # Create each nesting level the first time it is seen.
        if vals[0] not in collect_tweet:
            collect_tweet[vals[0]] = {}
        if vals[1] not in collect_tweet[vals[0]]:
            collect_tweet[vals[0]][vals[1]] = {}
        # Count the label for this date and hour.
        if vals[2] not in collect_tweet[vals[0]][vals[1]]:
            collect_tweet[vals[0]][vals[1]][vals[2]] = 1
        else:
            collect_tweet[vals[0]][vals[1]][vals[2]] += 1

print(collect_tweet)
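
If you'd rather not write the membership checks by hand, the same counting can probably be done with collections.defaultdict and collections.Counter from the standard library. This is only a sketch of that alternative, not the approach above: the function name count_labels and the sample list are made up for illustration, and it assumes every useful line has the form "date hour label".

```python
from collections import Counter, defaultdict


def count_labels(lines):
    """Return {date: {hour: Counter(label -> count)}} for 'date hour label' lines."""
    # A missing date gets a new inner defaultdict; a missing hour gets a new Counter.
    counts = defaultdict(lambda: defaultdict(Counter))
    for line in lines:
        vals = line.split()
        if len(vals) < 3:
            continue  # skip blank or malformed lines
        date, hour, label = vals[:3]
        counts[date][hour][label] += 1
    return counts


# Sample rows copied from the question's data (hypothetical stand-in for the file).
sample = [
    "2014-08-19 24 positive",
    "2014-08-19 24 positive",
    "2014-08-19 24 bot",
    "2014-08-19 24 positive",
]
result = count_labels(sample)
print(dict(result["2014-08-19"]["24"]))
# A Counter reports 0 for a label it has never seen.
print(result["2014-08-19"]["24"]["neutral"])
```

For the real file you could pass the open file object directly, e.g. count_labels(f) inside a with open('time_short.txt') as f: block. One thing to keep in mind: the Counter gives 0 for a missing label such as neutral when you look it up, but it does not store that key, so a printed dictionary will not show neutral: 0 unless you add it yourself.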