Introduction

This is a data wrangling project that focuses on fixing data quality and tidiness issues using Python. The dataset I am wrangling is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people’s dogs with a humorous comment about the dog.

These ratings almost always have a denominator of 10. The numerators are almost always greater than 10, because “they’re good dogs Brent.” The tweet archive used in this project contains basic tweet data (tweet ID, timestamp, text, etc.) for all 2356 of their tweets as they stood on August 1, 2017.
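
To make the rating format concrete, here is a minimal sketch of pulling a "numerator/denominator" pair out of tweet text. The pattern and the `extract_rating` helper are assumptions for illustration, not part of the archive or of any library. The sample text is one of the tweets retrieved later in this notebook.

```python
import re

# Assumed pattern: "<numerator>/<denominator>", where the numerator may be a decimal
RATING_PATTERN = re.compile(r'(\d+(?:\.\d+)?)/(\d+)')

def extract_rating(text):
    """Return (numerator, denominator) for the first rating-like match, or None."""
    match = RATING_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))

sample = ('This is Jackie. She was all ready to go out, but her friends '
          'just cancelled on her. 10/10 hang in there Jackie')
rating = extract_rating(sample)
```

A pattern this simple would also match things that are not ratings, such as dates or "24/7", so matches would still need checking during assessment.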

Load Libraries

In [1]:

# Load libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import datetime
import json
import os
import requests
import string
import tweepy
from IPython.display import Image, HTML
%matplotlib inline

Gather the Data

I will obtain data from three sources: a manually downloaded CSV file, a TSV file downloaded programmatically, and data retrieved through the Twitter API.

Twitter Archive

In [2]:

archive = pd.read_csv('twitter_archive_enhanced.csv')

Image Predictions

In [52]:

# Make directory if it doesn't already exist
folder_name = 'image_predictions'
if not os.path.exists(folder_name):
    os.makedirs(folder_name)

In [54]:

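# Get data
url = 'https://d17h27t6h515a5.cloudfront.net/topher/2017/August/599fd2ad_image-predictions/image-predictions.tsv'
response = requests.get(url)
# Stop here if the download failed (e.g. a 404) instead of saving an error page
response.raise_for_status()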

In [56]:

# Create file
with open(os.path.join(folder_name, url.split('/')[-1]), mode='wb') as file:
    file.write(response.content)

In [3]:

# The saved file keeps the name from the URL, which uses a hyphen
predictions = pd.read_csv('image_predictions/image-predictions.tsv', sep='\t')
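
Because the file is written under the last segment of the URL, the path used for reading has to match that name exactly. One way to avoid a mismatch is to derive the path once and reuse it. This is a small sketch; `local_path_for` and `predictions_path` are hypothetical names introduced here.

```python
import os

def local_path_for(url, folder):
    """Build the local path a download of `url` would be saved to inside `folder`."""
    file_name = url.rstrip('/').split('/')[-1]
    return os.path.join(folder, file_name)

url = 'https://d17h27t6h515a5.cloudfront.net/topher/2017/August/599fd2ad_image-predictions/image-predictions.tsv'
predictions_path = local_path_for(url, 'image_predictions')
```

The same `predictions_path` could then be passed both to `open(..., mode='wb')` when saving and to `pd.read_csv(predictions_path, sep='\t')` when loading.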

API Data

In [4]:

consumer_key = 'HIDDEN'
consumer_secret = 'HIDDEN'
access_token = 'HIDDEN'
access_secret = 'HIDDEN'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)

api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
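
The keys above are replaced with 'HIDDEN' for publication. An alternative that keeps secrets out of the notebook altogether is to read them from environment variables. The sketch below assumes four variable names that are purely hypothetical, and `load_credentials` is an illustrative helper, not part of tweepy.

```python
import os

# Hypothetical variable names; use whatever names your environment defines
REQUIRED_KEYS = ('TWITTER_CONSUMER_KEY', 'TWITTER_CONSUMER_SECRET',
                 'TWITTER_ACCESS_TOKEN', 'TWITTER_ACCESS_SECRET')

def load_credentials(env=None):
    """Return the four credentials from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        # Fail early with the names of the variables that still need to be set
        raise KeyError('Missing credentials: ' + ', '.join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The returned values could then be passed to `tweepy.OAuthHandler` and `set_access_token` in the same way as the hard-coded strings.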

In [5]:

# Get tweet info
tweet = api.get_status(archive.tweet_id[2000], tweet_mode='extended')

In [6]:

# Get json info
info = tweet._json
info

Out[6]:

{'created_at': 'Thu Dec 03 18:52:12 +0000 2015',
 'id': 672488522314567680,
 'id_str': '672488522314567680',
 'full_text': 'This is Jackie. She was all ready to go out, but her friends just cancelled on her. 10/10 hang in there Jackie https://t.co/rVfi6CCidK',
 'truncated': False,
 'display_text_range': [0, 134],
 'entities': {'hashtags': [],
  'symbols': [],
  'user_mentions': [],
  'urls': [],
  'media': [{'id': 672488519928037376,
    'id_str': '672488519928037376',
    'indices': [111, 134],
    'media_url': 'http://pbs.twimg.com/media/CVUovvHWwAAD-nu.jpg',
    'media_url_https': 'https://pbs.twimg.com/media/CVUovvHWwAAD-nu.jpg',
    'url': 'https://t.co/rVfi6CCidK',
    'display_url': 'pic.twitter.com/rVfi6CCidK',
    'expanded_url': 'https://twitter.com/dog_rates/status/672488522314567680/photo/1',
    'type': 'photo',
    'sizes': {'thumb': {'w': 150, 'h': 150, 'resize': 'crop'},
     'large': {'w': 304, 'h': 411, 'resize': 'fit'},
     'small': {'w': 304, 'h': 411, 'resize': 'fit'},
     'medium': {'w': 304, 'h': 411, 'resize': 'fit'}}}]},
 'extended_entities': {'media': [{'id': 672488519928037376,
    'id_str': '672488519928037376',
    'indices': [111, 134],
    'media_url': 'http://pbs.twimg.com/media/CVUovvHWwAAD-nu.jpg',
    'media_url_https': 'https://pbs.twimg.com/media/CVUovvHWwAAD-nu.jpg',
    'url': 'https://t.co/rVfi6CCidK',
    'display_url': 'pic.twitter.com/rVfi6CCidK',
    'expanded_url': 'https://twitter.com/dog_rates/status/672488522314567680/photo/1',
    'type': 'photo',
    'sizes': {'thumb': {'w': 150, 'h': 150, 'resize': 'crop'},
     'large': {'w': 304, 'h': 411, 'resize': 'fit'},
     'small': {'w': 304, 'h': 411, 'resize': 'fit'},
     'medium': {'w': 304, 'h': 411, 'resize': 'fit'}}}]},
 'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
 'in_reply_to_status_id': None,
 'in_reply_to_status_id_str': None,
 'in_reply_to_user_id': None,
 'in_reply_to_user_id_str': None,
 'in_reply_to_screen_name': None,
 'user': {'id': 4196983835,
  'id_str': '4196983835',
  'name': 'WeRateDogs™🏳️\u200d🌈',
  'screen_name': 'dog_rates',
  'location': '𝓶𝓮𝓻𝓬𝓱 ↴      DM YOUR DOGS',
  'description': 'Your Only Source for Pawfessional Dog Ratings STORE: @ShopWeRateDogs | IG, FB & SC: WeRateDogs | MOBILE APP: @GoodDogsGame Business: dogratingtwitter@gmail.com',
  'url': 'https://t.co/N7sNNHAEXS',
  'entities': {'url': {'urls': [{'url': 'https://t.co/N7sNNHAEXS',
      'expanded_url': 'http://weratedogs.com',
      'display_url': 'weratedogs.com',
      'indices': [0, 23]}]},
   'description': {'urls': []}},
  'protected': False,
  'followers_count': 6984325,
  'friends_count': 9,
  'listed_count': 4521,
  'created_at': 'Sun Nov 15 21:41:29 +0000 2015',
  'favourites_count': 134498,
  'utc_offset': None,
  'time_zone': None,
  'geo_enabled': True,
  'verified': True,
  'statuses_count': 7179,
  'lang': 'en',
  'contributors_enabled': False,
  'is_translator': False,
  'is_translation_enabled': False,
  'profile_background_color': '000000',
  'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
  'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
  'profile_background_tile': False,
  'profile_image_url': 'http://pbs.twimg.com/profile_images/948761950363664385/Fpr2Oz35_normal.jpg',
  'profile_image_url_https': 'https://pbs.twimg.com/profile_images/948761950363664385/Fpr2Oz35_normal.jpg',
  'profile_banner_url': 'https://pbs.twimg.com/profile_banners/4196983835/1525830435',
  'profile_link_color': 'F5ABB5',
  'profile_sidebar_border_color': '000000',
  'profile_sidebar_fill_color': '000000',
  'profile_text_color': '000000',
  'profile_use_background_image': False,
  'has_extended_profile': True,
  'default_profile': False,
  'default_profile_image': False,
  'following': False,
  'follow_request_sent': False,
  'notifications': False,
  'translator_type': 'none'},
 'geo': None,
 'coordinates': None,
 'place': None,
 'contributors': None,
 'is_quote_status': False,
 'retweet_count': 460,
 'favorite_count': 1151,
 'favorited': False,
 'retweeted': False,
 'possibly_sensitive': False,
 'possibly_sensitive_appealable': False,
 'lang': 'en'}

In [88]:

info['retweet_count']

Out[88]:

460

In [89]:

info['favorite_count']

Out[89]:

1151

In [86]:

info['user']['followers_count']

Out[86]:

6982890
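
The individual lookups above can be gathered into one flat record per tweet, which is a convenient shape for building a DataFrame later. This is a sketch only: `summarize_tweet` is a hypothetical helper, and `example_info` is a trimmed-down copy of the response shown in Out[6].

```python
def summarize_tweet(info):
    """Pick a few fields of interest out of a tweet's JSON dictionary."""
    return {
        'tweet_id': info['id'],
        'retweet_count': info['retweet_count'],
        'favorite_count': info['favorite_count'],
        # The follower count is nested inside the 'user' dictionary
        'followers_count': info['user']['followers_count'],
    }

# Trimmed-down stand-in for the full API response
example_info = {
    'id': 672488522314567680,
    'retweet_count': 460,
    'favorite_count': 1151,
    'user': {'followers_count': 6984325},
}
record = summarize_tweet(example_info)
```

A list of such records can be passed directly to `pd.DataFrame`.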

In [19]:

print(datetime.datetime.now().time())
10:38:26.842978

In [7]:

# Make file if it doesn't already exist
file_name = 'tweet_json.txt'
if not os.path.isfile(file_name):
    open(file_name, 'w').close()

In [5]:

tweet_ids = archive.tweet_id
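
Collecting every tweet comes down to one pattern: try each ID, keep the successes, and record the failures with their error messages. The sketch below isolates that pattern with a stand-in fetch function so it runs without API access. `collect_tweets` and `fake_fetch` are hypothetical names, not the code used in this notebook.

```python
def collect_tweets(tweet_ids, fetch):
    """Call `fetch` on each ID; return (records, errors keyed by tweet ID)."""
    records, errors = [], {}
    for tweet_id in tweet_ids:
        try:
            records.append(fetch(tweet_id))
        except Exception as e:
            # Store the error message itself, keyed by the ID that failed
            errors[tweet_id] = str(e)
    return records, errors

def fake_fetch(tweet_id):
    """Stand-in for an API call; pretends that tweet 2 has been deleted."""
    if tweet_id == 2:
        raise ValueError('No status found with that ID.')
    return {'id': tweet_id}

records, errors = collect_tweets([1, 2, 3], fake_fetch)
```

With tweepy, `fetch` could be something like `lambda tweet_id: api.get_status(tweet_id, tweet_mode='extended')._json`, mirroring the call used earlier in this notebook.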

In [25]:

tweet_errors = {}
tweet_count = 1
data = []
for tweet_id in tweet_ids:
    try:
        # Print id counter
        print(tweet_count)
        # Collect tweet info
        tweet = api.get_status(tweet_id, tweet_mode='extended')
        info = tweet._json
        # Add to the collected tweets and rewrite the file with everything so far
        data.append(info)
        with open(file_name, 'w') as file:
            json.dump(data, file)
        # Print timer info to estimate time until wake-up
        print(datetime.datetime.now().time())
        # Add one to the tweet count for further printing
        tweet_count += 1

    except Exception as e:
        # Print exception info and record the error message for this tweet id
        # (storing `info` here would save the previous tweet's data, not the error)
        print(str(tweet_id) + ": " + str(e))
        tweet_errors[str(tweet_count - 1) + "_" + str(tweet_id)] = str(e)
1
14:35:25.741490
2
14:35:25.878125
3
14:35:26.023736
4
14:35:26.170343
5
14:35:26.311471
6
14:35:26.463066
7
14:35:26.607679
8
14:35:26.748807
9
14:35:26.892929
10
14:35:27.035053
11
14:35:27.183657
12
14:35:27.327272
13
14:35:27.486846
14
14:35:27.631459
15
14:35:27.771591
16
14:35:27.910220
17
14:35:28.056828
18
14:35:28.201442
19
14:35:28.346068
20
888202515573088257: [{'code': 144, 'message': 'No status found with that ID.'}]
20
14:35:28.609364
21
14:35:28.750986
22
14:35:28.904575
23
14:35:29.077114
24
14:35:29.246661
25
14:35:29.398255
26
14:35:29.576779
27
14:35:29.730871
28
14:35:29.888954
29
14:35:30.053514
30
14:35:30.220069
31
14:35:30.377648
32
14:35:30.543206
33
14:35:30.712752
34
14:35:30.868336
35
14:35:31.023920
36
14:35:31.186992
37
14:35:31.350061
38
14:35:31.521602
39
14:35:31.698130
40
14:35:31.850722
41
14:35:31.999326
42
14:35:32.157901
43
14:35:32.308499
44
14:35:32.474560
45
14:35:32.644611
46
14:35:32.813175
47
14:35:32.967762
48
14:35:33.118359
49
14:35:33.283916
50
14:35:33.440498
51
14:35:33.599577
52
14:35:33.769124
53
14:35:33.926703
54
14:35:34.109728
55
14:35:34.282276
56
14:35:34.436863
57
14:35:34.589455
58
14:35:34.775956
59
14:35:34.936527
60
14:35:35.114053
61
14:35:35.286592
62
14:35:35.470101
63
14:35:35.670565
64
14:35:35.830151
65
14:35:35.990722
66
14:35:36.156279
67
14:35:36.317847
68
14:35:36.488391
69
14:35:36.648963
70
14:35:36.810530
71
14:35:36.976087
72
14:35:37.181549
73
14:35:37.362082
74
14:35:37.531630
75
14:35:37.731096
76
14:35:37.922584
77
14:35:38.113075
78
14:35:38.284616
79
14:35:38.464137
80
14:35:38.631689
81
14:35:38.828174
82
14:35:39.005699
83
14:35:39.208158
84
14:35:39.384686
85
14:35:39.558222
86
14:35:39.730761
87
14:35:39.935214
88
14:35:40.151636
89
14:35:40.348121
90
14:35:40.520165
91
14:35:40.719632
92
14:35:40.914112
93
14:35:41.134523
94
14:35:41.319030
95
873697596434513921: [{'code': 144, 'message': 'No status found with that ID.'}]
95
14:35:41.639174
96
14:35:41.825181
97
14:35:42.001216
98
14:35:42.212652
99
14:35:42.402145
100
14:35:42.587649
101
14:35:42.787116
102
14:35:42.998551
103
14:35:43.180571
104
14:35:43.379062
105
14:35:43.563569
106
14:35:43.770017
107
14:35:43.956519
108
14:35:44.159975
109
14:35:44.391356
110
14:35:44.590823
111
14:35:44.793788
112
14:35:44.986283
113
14:35:45.181761
114
14:35:45.387212
115
14:35:45.590667
116
14:35:45.797116
117
869988702071779329: [{'code': 144, 'message': 'No status found with that ID.'}]
117
14:35:46.107286
118
14:35:46.303772
119
14:35:46.518215
120
14:35:46.710700
121
14:35:46.912162
122
14:35:47.116616
123
14:35:47.325059
124
14:35:47.523528
125
14:35:47.729483
126
14:35:47.921981
127
14:35:48.131925
128
14:35:48.376272
129
14:35:48.595685
130
866816280283807744: [{'code': 144, 'message': 'No status found with that ID.'}]
130
14:35:48.924806
131
14:35:49.121281
132
14:35:49.323247
133
14:35:49.536690
134
14:35:49.743642
135
14:35:49.941114
136
14:35:50.143573
137
14:35:50.349533
138
14:35:50.549998
139
14:35:50.766419
140
14:35:50.982358
141
14:35:51.190801
142
14:35:51.400242
143
14:35:51.647580
144
14:35:51.853031
145
14:35:52.070450
146
14:35:52.302829
147
14:35:52.533222
148
14:35:52.851372
149
14:35:53.052833
150
14:35:53.299175
151
14:35:53.542524
152
861769973181624320: [{'code': 144, 'message': 'No status found with that ID.'}]
152
14:35:53.878626
153
14:35:54.123986
154
14:35:54.349383
155
14:35:54.573784
156
14:35:54.884951
157
14:35:55.097383
158
14:35:55.345732
159
14:35:55.584106
160
14:35:55.797536
161
14:35:56.019941
162
14:35:56.279248
163
14:35:56.496667
164
14:35:56.737528
165
14:35:56.980891
166
14:35:57.234718
167
14:35:57.493028
168
14:35:57.744356
169
14:35:57.966761
170
14:35:58.189167
171
14:35:58.434018
172
14:35:58.653939
173
14:35:58.906265
174
14:35:59.145624
175
14:35:59.394464
176
14:35:59.648799
177
14:35:59.902123
178
14:36:00.129526
179
14:36:00.359923
180
14:36:00.591305
181
14:36:00.820692
182
14:36:01.058057
183
14:36:01.323348
184
14:36:01.592648
185
14:36:01.841993
186
14:36:02.113267
187
14:36:02.341656
188
14:36:02.619912
189
14:36:02.848303
190
14:36:03.158977
191
14:36:03.495089
192
14:36:03.837175
193
14:36:04.072545
194
14:36:04.346812
195
14:36:04.580693
196
14:36:04.832032
197
14:36:05.067906
198
14:36:05.306268
199
14:36:05.575549
200
14:36:05.823885
201
14:36:06.081197
202
14:36:06.356967
203
14:36:06.598826
204
14:36:06.841178
205
14:36:07.088517
206
14:36:07.331868
207
14:36:07.582197
208
14:36:07.827048
209
14:36:08.067909
210
14:36:08.333200
211
14:36:08.584528
212
14:36:08.840843
213
14:36:09.120096
214
14:36:09.378909
215
14:36:09.632736
216
14:36:09.892046
217
14:36:10.148357
218
14:36:10.431601
219
14:36:10.686918
220
14:36:10.945240
221
14:36:11.248430
222
14:36:11.493774
223
14:36:11.764555
224
14:36:12.011894
225
14:36:12.291651
226
14:36:12.550968
227
14:36:12.830221
228
14:36:13.089528
229
14:36:13.373769
230
14:36:13.652025
231
14:36:13.931289
232
14:36:14.184612
233
14:36:14.465860
234
14:36:14.744619
235
14:36:15.014404
236
14:36:15.291663
237
14:36:15.585893
238
14:36:15.872128
239
14:36:16.208739
240
14:36:16.467049
241
14:36:16.743311
242
14:36:17.004132
243
845459076796616705: [{'code': 144, 'message': 'No status found with that ID.'}]
243
14:36:17.390100
244
14:36:17.679833
245
14:36:18.001971
246
14:36:18.286716
247
14:36:18.569481
248
14:36:18.840756
249
14:36:19.118015
250
14:36:19.407750
251
14:36:19.696977
252
14:36:19.991203
253
14:36:20.295389
254
14:36:20.588606
255
842892208864923648: [{'code': 144, 'message': 'No status found with that ID.'}]
255
14:36:20.995518
256
14:36:21.304198
257
14:36:21.607891
258
14:36:21.906095
259
14:36:22.218260
260
14:36:22.515466
261
14:36:22.814172
262
14:36:23.083957
263
14:36:23.377173
264
14:36:23.664405
265
14:36:23.949652
266
14:36:24.255834
267
14:36:24.561028
268
14:36:24.839284
269
14:36:25.129508
270
14:36:25.427712
271
14:36:25.707962
272
14:36:25.999702
273
14:36:26.288945
274
14:36:26.565207
275
14:36:26.844460
276
14:36:27.119724
277
14:36:27.403965
278
14:36:27.688221
279
14:36:27.983431
280
14:36:28.293602
281
14:36:28.583827
282
14:36:28.878040
283
14:36:29.165282
284
14:36:29.460493
285
14:36:29.755704
286
14:36:30.042936
287
14:36:30.347122
288
14:36:30.634858
289
14:36:30.933566
290
14:36:31.233764
291
14:36:31.525982
292
837012587749474308: [{'code': 144, 'message': 'No status found with that ID.'}]
292
14:36:31.924916
293
14:36:32.237099
294
14:36:32.536299
295
14:36:32.863425
296
14:36:33.165617
297
14:36:33.474319
298
14:36:33.800440
299
14:36:34.089666
300
14:36:34.406818
301
14:36:34.694555
302
14:36:35.007235
303
14:36:35.302446
304
14:36:35.594665
305
14:36:35.928277
306
14:36:36.246426
307
14:36:36.542152
308
14:36:36.851326
309
14:36:37.181443
310
14:36:37.492611
311
14:36:37.792809
312
14:36:38.110464
313
14:36:38.431606
314
14:36:38.760232
315
14:36:39.070402
316
14:36:39.426971
317
14:36:39.771052
318
14:36:40.080225
319
14:36:40.434279
320
14:36:40.769393
321
14:36:41.068593
322
14:36:41.399708
323
14:36:41.729344
324
14:36:42.041015
325
14:36:42.373141
326
14:36:42.715227
327
14:36:43.051328
328
14:36:43.352523
329
14:36:43.692623
330
14:36:44.027738
331
14:36:44.345888
332
14:36:44.654064
333
14:36:44.984182
334
14:36:45.303843
335
14:36:45.610035
336
14:36:45.948132
337
14:36:46.273263
338
14:36:46.587423
339
14:36:46.914058
340
14:36:47.233708
341
14:36:47.559836
342
14:36:47.887468
343
14:36:48.205627
344
14:36:48.543242
345
14:36:48.895301
346
14:36:49.227413
347
14:36:49.547557
348
14:36:49.882661
349
14:36:50.196836
350
14:36:50.533935
351
14:36:50.881007
352
14:36:51.204144
353
14:36:51.546252
354
14:36:51.886344
355
14:36:52.213469
356
14:36:52.544584
357
14:36:52.891656
358
14:36:53.221786
359
14:36:53.544923
360
14:36:53.865573
361
14:36:54.212162
362
14:36:54.560231
363
14:36:54.909299
364
14:36:55.231940
365
14:36:55.635870
366
14:36:55.967002
367
14:36:56.317066
368
14:36:56.653168
369
14:36:56.982791
370
14:36:57.306442
371
14:36:57.655509
372
14:36:57.986624
373
14:36:58.344667
374
14:36:58.701240
375
827228250799742977: [{'code': 144, 'message': 'No status found with that ID.'}]
375
14:36:59.144056
376
14:36:59.504094
377
14:36:59.836709
378
14:37:00.179309
379
14:37:00.536354
380
14:37:00.894398
381
14:37:01.254435
382
14:37:01.586065
383
14:37:01.913695
384
14:37:02.269743
385
14:37:02.603850
386
14:37:02.958911
387
14:37:03.327432
388
14:37:03.688467
389
14:37:04.111352
390
14:37:04.475882
391
14:37:04.859865
392
14:37:05.214916
393
14:37:05.577452
394
14:37:05.946465
395
14:37:06.308019
396
14:37:06.666062
397
14:37:07.046046
398
14:37:07.408582
399
14:37:07.756168
400
14:37:08.109225
401
14:37:08.477241
402
14:37:08.841268
403
14:37:09.202808
404
14:37:09.571339
405
14:37:09.935366
406
14:37:10.319843
407
14:37:10.668910
408
14:37:11.029958
409
14:37:11.391990
410
14:37:11.744049
411
14:37:12.115562
412
14:37:12.506527
413
14:37:12.862575
414
14:37:13.218130
415
14:37:13.574188
416
14:37:13.953175
417
14:37:14.345631
418
14:37:14.705682
419
14:37:15.068712
420
14:37:15.440727
421
14:37:15.799767
422
14:37:16.151343
423
14:37:16.507391
424
14:37:16.882389
425
14:37:17.244924
426
14:37:17.614945
427
14:37:18.048799
428
14:37:18.419807
429
14:37:18.781840
430
14:37:19.150369
431
14:37:19.513399
432
14:37:19.909844
433
14:37:20.281863
434
14:37:20.641913
435
14:37:21.018905
436
14:37:21.401881
437
14:37:21.813791
438
14:37:22.208744
439
14:37:22.617652
440
14:37:23.013593
441
14:37:23.389094
442
14:37:23.780553
443
14:37:24.166522
444
14:37:24.541027
445
14:37:24.949943
446
14:37:25.314978
447
14:37:25.688978
448
14:37:26.095890
449
14:37:26.469905
450
14:37:26.911229
451
14:37:27.289218
452
14:37:27.670200
453
14:37:28.066166
454
14:37:28.457624
455
14:37:28.856558
456
14:37:29.234558
457
14:37:29.637492
458
14:37:30.028448
459
14:37:30.430382
460
14:37:30.836803
461
14:37:31.236746
462
14:37:31.638671
463
14:37:32.028629
464
14:37:32.442031
465
14:37:32.833489
466
14:37:33.216465
467
14:37:33.604932
468
14:37:34.015352
469
14:37:34.403315
470
14:37:34.816211
471
14:37:35.210158
472
14:37:35.620075
473
14:37:36.031973
474
14:37:36.423926
475
14:37:36.844810
476
14:37:37.249737
477
14:37:37.656649
478
14:37:38.049599
479
14:37:38.498913
480
14:37:38.905835
481
14:37:39.405500
482
14:37:39.806932
483
14:37:40.212859
484
14:37:40.605315
485
14:37:41.032173
486
14:37:41.430122
487
14:37:41.842021
488
14:37:42.248933
489
14:37:42.646881
490
14:37:43.079737
491
14:37:43.474681
492
14:37:43.880596
493
14:37:44.270070
494
14:37:44.667515
495
14:37:45.061461
496
14:37:45.460406
497
14:37:45.890762
498
14:37:46.335585
499
14:37:46.733521
500
14:37:47.132968
501
14:37:47.548857
502
14:37:47.960766
503
14:37:48.408077
504
14:37:48.811503
505
14:37:49.220410
506
14:37:49.688183
507
14:37:50.102580
508
14:37:50.512485
509
14:37:50.937349
510
14:37:51.405109
511
14:37:51.828976
512
14:37:52.246859
513
14:37:52.659766
514
14:37:53.082143
515
14:37:53.526460
516
14:37:53.983743
517
14:37:54.409120
518
14:37:54.851937
519
14:37:55.258355
520
14:37:55.666768
521
14:37:56.102618
522
14:37:56.514517
523
14:37:56.930406
524
14:37:57.340320
525
14:37:57.771169
526
14:37:58.186059
527
14:37:58.618915
528
14:37:59.061239
529
14:37:59.482114
530
14:37:59.944892
531
14:38:00.364287
532
14:38:00.809114
533
14:38:01.244949
534
14:38:01.662832
535
14:38:02.093197
536
14:38:02.581891
537
14:38:03.038670
538
14:38:03.485980
539
14:38:03.913852
540
14:38:04.335739
541
14:38:04.806995
542
14:38:05.226379
543
14:38:05.654235
544
14:38:06.115004
545
14:38:06.565812
546
14:38:07.030569
547
14:38:07.493332
548
14:38:07.938648
549
14:38:08.398946
550
14:38:08.835777
551
14:38:09.266626
552
14:38:09.704976
553
14:38:10.164748
554
14:38:10.605074
555
14:38:11.065854
556
14:38:11.508682
557
14:38:11.966458
558
802247111496568832: [{'code': 144, 'message': 'No status found with that ID.'}]
558
14:38:12.517983
559
14:38:12.976274
560
14:38:13.419090
561
14:38:13.867399
562
14:38:14.301251
563
14:38:14.753044
564
14:38:15.208825
565
14:38:15.688543
566
14:38:16.172276
567
14:38:16.625065
568
14:38:17.075860
569
14:38:17.541131
570
14:38:17.975475
571
14:38:18.446216
572
14:38:18.890042
573
14:38:19.328374
574
14:38:19.801110
575
14:38:20.245930
576
14:38:20.708704
577
14:38:21.213355
578
14:38:21.732966
579
14:38:22.176789
580
14:38:22.636560
581
14:38:23.088352
582
14:38:23.565090
583
14:38:24.064754
584
14:38:24.510563
585
14:38:25.056114
586
14:38:25.562760
587
14:38:26.032504
588
14:38:26.561609
589
14:38:27.054292
590
14:38:27.550471
591
14:38:28.018736
592
14:38:28.517402
593
14:38:28.987650
594
14:38:29.490328
595
14:38:29.992985
596
14:38:30.455254
597
14:38:30.947948
598
14:38:31.424684
599
14:38:31.905400
600
14:38:32.378146
601
14:38:32.856879
602
14:38:33.311169
603
14:38:33.798866
604
14:38:34.271120
605
14:38:34.734881
606
14:38:35.216099
607
14:38:35.695827
608
14:38:36.175558
609
14:38:36.652284
610
14:38:37.138983
611
14:38:37.639654
612
14:38:38.111393
613
14:38:38.582638
614
14:38:39.072349
615
14:38:39.537106
616
14:38:40.032298
617
14:38:40.524981
618
14:38:41.029149
619
14:38:41.506874
620
14:38:42.015042
621
14:38:42.486297
622
14:38:43.002916
623
14:38:43.501583
624
14:38:44.000260
625
14:38:44.512396
626
14:38:45.005079
627
14:38:45.528690
628
14:38:46.034368
629
14:38:46.535552
630
14:38:47.017277
631
14:38:47.568309
632
14:38:48.082945
633
14:38:48.577130
634
14:38:49.061835
635
14:38:49.561016
636
14:38:50.076638
637
14:38:50.589267
638
14:38:51.103903
639
14:38:51.605081
640
14:38:52.104745
641
14:38:52.600933
642
14:38:53.092126
643
14:38:53.601763
644
14:38:54.123392
645
14:38:54.621061
646
14:38:55.134195
647
14:38:55.668777
648
14:38:56.165982
649
14:38:56.661174
650
14:38:57.162833
651
14:38:57.681962
652
14:38:58.189112
653
14:38:58.707736
654
14:38:59.200933
655
14:38:59.703590
656
14:39:00.225699
657
14:39:00.725876
658
14:39:01.247986
659
14:39:01.763607
660
14:39:02.270757
661
14:39:02.777414
662
14:39:03.300017
663
14:39:03.790222
664
14:39:04.284911
665
14:39:04.799536
666
14:39:05.325152
667
14:39:05.829309
668
14:39:06.326979
669
14:39:06.839608
670
14:39:07.331314
671
14:39:07.842946
672
14:39:08.351587
673
14:39:08.876206
674
14:39:09.395816
675
14:39:09.921434
676
14:39:10.424090
677
14:39:10.945696
678
14:39:11.483280
679
14:39:11.998902
680
14:39:12.514030
681
14:39:13.061085
682
14:39:13.585190
683
14:39:14.126742
684
14:39:14.640873
685
14:39:15.165482
686
14:39:15.694572
687
14:39:16.212693
688
14:39:16.721344
689
14:39:17.269385
690
14:39:17.791495
691
14:39:18.309135
692
14:39:18.834730
693
14:39:19.363821
694
14:39:19.881952
695
14:39:20.411548
696
14:39:20.949111
697
14:39:21.493656
698
14:39:22.041213
699
14:39:22.555838
700
14:39:23.136791
701
14:39:23.679360
702
14:39:24.206456
703
14:39:24.735552
704
14:39:25.273126
705
14:39:25.784771
706
14:39:26.307374
707
14:39:26.821505
708
14:39:27.339149
709
14:39:27.871725
710
14:39:28.392837
711
14:39:28.920942
712
14:39:29.454033
713
14:39:30.039467
714
14:39:30.577554
715
14:39:31.115117
716
14:39:31.645698
717
14:39:32.213704
718
14:39:32.751267
719
14:39:33.310278
720
14:39:33.841374
721
14:39:34.370475
722
14:39:34.925991
723
14:39:35.460079
724
14:39:36.003133
725
14:39:36.533714
726
14:39:37.085252
727
14:39:37.637282
728
14:39:38.188806
729
14:39:38.767273
730
14:39:39.340739
731
14:39:39.891291
732
14:39:40.442321
733
14:39:40.979884
734
14:39:41.546381
735
14:39:42.121842
736
14:39:42.715279
737
14:39:43.261818
738
14:39:43.817333
739
14:39:44.363390
740
14:39:44.897961
741
14:39:45.442016
742
14:39:45.999042
743
14:39:46.577496
744
14:39:47.132033
745
14:39:47.672095
746
14:39:48.209658
747
14:39:48.753205
748
14:39:49.304247
749
14:39:49.837346
750
14:39:50.375916
751
14:39:50.925458
752
14:39:51.467011
753
14:39:52.030011
754
14:39:52.569085
755
14:39:53.150531
756
14:39:53.693588
757
14:39:54.314950
758
14:39:54.870477
759
14:39:55.436963
760
14:39:55.987491
761
14:39:56.628805
762
14:39:57.201791
763
14:39:57.773263
764
14:39:58.343752
765
14:39:58.910762
766
14:39:59.487242
767
14:40:00.044258
768
14:40:00.640663
769
14:40:01.227611
770
14:40:01.798086
771
14:40:02.345633
772
14:40:02.889202
773
14:40:03.448720
774
14:40:04.033158
775
775096608509886464: [{'code': 144, 'message': 'No status found with that ID.'}]
775
14:40:04.707869
776
14:40:05.267386
777
14:40:05.826406
778
14:40:06.378436
779
14:40:06.947431
780
14:40:07.519900
781
14:40:08.102858
782
14:40:08.672336
783
14:40:09.245310
784
14:40:09.820793
785
14:40:10.373336
786
14:40:10.953785
787
14:40:11.510801
788
14:40:12.104736
789
14:40:12.667747
790
14:40:13.236227
791
14:40:13.822175
792
14:40:14.429552
793
14:40:15.018986
794
14:40:15.597452
795
14:40:16.172914
796
14:40:16.755871
797
14:40:17.330852
798
14:40:17.895343
799
14:40:18.476296
800
14:40:19.067235
801
14:40:19.657667
802
14:40:20.285013
803
14:40:20.867456
804
14:40:21.456388
805
14:40:22.049814
806
14:40:22.622284
807
14:40:23.220190
808
14:40:23.814107
809
14:40:24.379596
810
14:40:24.965031
811
14:40:25.550980
812
14:40:26.169342
813
14:40:26.771245
814
14:40:27.367650
815
14:40:27.995477
816
14:40:28.597876
817
14:40:29.188298
818
14:40:29.839567
819
14:40:30.455942
820
14:40:31.081788
821
14:40:31.721079
822
14:40:32.296057
823
14:40:32.897450
824
14:40:33.468931
825
14:40:34.047910
826
14:40:34.660272
827
14:40:35.248215
828
14:40:35.827172
829
14:40:36.407621
830
14:40:37.005035
831
14:40:37.624379
832
14:40:38.207819
833
14:40:38.845141
834
14:40:39.433568
835
14:40:40.034960
836
14:40:40.671773
837
14:40:41.263213
838
14:40:41.882570
839
14:40:42.519452
840
14:40:43.117852
841
14:40:43.750162
842
14:40:44.349570
843
14:40:44.950992
844
14:40:45.615216
845
14:40:46.235073
846
14:40:46.825509
847
14:40:47.433883
848
14:40:48.026299
849
14:40:48.615735
850
14:40:49.246081
851
14:40:49.838526
852
14:40:50.435447
853
14:40:51.066279
854
14:40:51.675157
855
14:40:52.340897
856
14:40:52.972209
857
14:40:53.573108
858
14:40:54.203927
859
14:40:54.802840
860
14:40:55.400243
861
14:40:56.010128
862
14:40:56.641946
863
14:40:57.268785
864
14:40:57.869180
865
14:40:58.471569
866
14:40:59.081459
867
14:40:59.682851
868
14:41:00.310678
869
14:41:00.913587
870
14:41:01.528958
871
14:41:02.158287
872
14:41:02.792591
873
14:41:03.400494
874
14:41:04.015849
875
14:41:04.664140
876
14:41:05.268524
877
14:41:05.884394
878
14:41:06.495760
879
14:41:07.138054
880
14:41:07.849153
881
14:41:08.513882
882
14:41:09.141227
883
14:41:09.767059
884
14:41:10.376440
885
14:41:10.996792
886
14:41:11.624115
887
14:41:12.246966
888
14:41:12.934142
889
14:41:13.567448
890
14:41:14.193292
891
14:41:14.834087
892
14:41:15.443962
893
14:41:16.067809
894
14:41:16.698124
895
14:41:17.320978
896
14:41:17.947303
897
14:41:18.566670
898
14:41:19.214950
899
14:41:19.833321
900
14:41:20.455659
901
14:41:21.101458
902
14:41:21.733768
903
14:41:22.362112
904
14:41:22.999408
905
14:41:23.626743
906
14:41:24.268043
907
14:41:24.915326
908
14:41:25.560614
909
14:41:26.214865
910
14:41:26.877114
911
14:41:27.541339
912
14:41:28.209070
913
14:41:28.862324
914
14:41:29.526055
915
14:41:30.204760
916
14:41:30.840087
917
14:41:31.477394
918
14:41:32.107720
919
14:41:32.761971
920
14:41:33.388296
921
14:41:34.031600
922
14:41:34.690838
923
14:41:35.332145
924
14:41:35.964454
925
14:41:36.637172
926
14:41:37.267487
927
14:41:37.907791
928
14:41:38.594974
929
14:41:39.255209
930
14:41:39.898502
931
14:41:40.564746
932
14:41:41.230482
933
14:41:41.875757
934
14:41:42.529025
935
14:41:43.185270
936
14:41:43.848003
937
14:41:44.491294
938
14:41:45.158511
939
14:41:45.861641
940
14:41:46.531850
941
14:41:47.193607
942
14:41:47.836887
943
14:41:48.479171
944
14:41:49.117504
945
14:41:49.759787
946
14:41:50.411570
947
14:41:51.080781
948
14:41:51.750006
949
14:41:52.442675
950
14:41:53.125849
951
14:41:53.775134
952
14:41:54.417417
953
14:41:55.061201
954
14:41:55.729424
955
14:41:56.385182
956
14:41:57.058897
957
14:41:57.701180
958
14:41:58.373899
959
14:41:59.086993
960
14:41:59.738262
961
14:42:00.431912
962
14:42:01.090153
963
14:42:01.772340
964
14:42:02.427608
965
14:42:03.110804
966
14:42:03.780533
967
14:42:04.446764
968
14:42:05.108503
969
14:42:05.771740
970
14:42:06.436480
971
14:42:07.104694
972
14:42:07.776920
973
14:42:08.509497
974
14:42:09.229089
975
14:42:09.912281
976
14:42:10.600441
977
14:42:11.273159
978
14:42:11.967313
979
14:42:12.627548
980
14:42:13.286292
981
14:42:14.001893
982
14:42:14.702536
983
14:42:15.383231
984
14:42:16.066909
985
14:42:16.733643
986
14:42:17.408363
987
14:42:18.094539
988
14:42:18.775718
989
14:42:19.438461
990
14:42:20.131144
991
14:42:20.790889
992
14:42:21.463105
993
14:42:22.149282
994
14:42:22.851919
995
14:42:23.541077
996
14:42:24.233247
997
14:42:24.907445
998
14:42:25.567689
999
14:42:26.247871
1000
14:42:26.928571
1001
14:42:27.634230
1002
14:42:28.329395
1003
14:42:29.028033
1004
14:42:29.720696
1005
14:42:30.389438
1006
14:42:31.063646
1007
14:42:31.765793
1008
14:42:32.439990
1009
14:42:33.115699
1010
14:42:33.803886
1011
14:42:34.555876
1012
14:42:35.261515
1013
14:42:35.944690
1014
14:42:36.664282
1015
14:42:37.354437
1016
14:42:38.100959
1017
14:42:38.897336
1018
14:42:39.592991
1019
14:42:40.292135
1020
14:42:40.965348
1021
14:42:41.658011
1022
14:42:42.407029
1023
14:42:43.096187
1024
14:42:43.776883
1025
14:42:44.480522
1026
14:42:45.180650
1027
14:42:45.867329
1028
14:42:46.580423
1029
14:42:47.339910
1030
14:42:48.054001
1031
14:42:48.747663
1032
14:42:49.450783
1033
14:42:50.179351
1034
14:42:50.879479
1035
14:42:51.576134
1036
14:42:52.280756
1037
14:42:53.045227
1038
14:42:53.760325
1039
14:42:54.499362
1040
14:42:55.274300
1041
14:42:56.070185
1042
14:42:56.819687
1043
14:42:57.645995
1044
14:42:58.409964
1045
14:42:59.206340
1046
14:43:00.052092
1047
14:43:00.746740
1048
14:43:01.434433
1049
14:43:02.149034
1050
14:43:02.880093
1051
14:43:03.611158
1052
14:43:04.310794
1053
14:43:05.023908
1054
14:43:05.751482
1055
14:43:06.493007
1056
14:43:07.225050
1057
14:43:08.012966
1058
14:43:08.728054
1059
14:43:09.438166
1060
14:43:10.150263
1061
14:43:10.842431
1062
14:43:11.553531
1063
14:43:12.279106
1064
14:43:12.981229
1065
14:43:13.699329
1066
14:43:14.424898
1067
14:43:15.174892
1068
14:43:15.869037
1069
14:43:16.574666
1070
14:43:17.308727
1071
14:43:18.010850
1072
14:43:18.706999
1073
14:43:19.427075
1074
14:43:20.166119
1075
14:43:20.929097
1076
14:43:21.649675
1077
14:43:22.369776
1078
14:43:23.143214
1079
14:43:23.868292
1080
14:43:24.574404
1081
14:43:25.282019
1082
14:43:26.010575
1083
14:43:26.735157
1084
14:43:27.460721
1085
14:43:28.210235
1086
14:43:28.935297
1087
14:43:29.697284
1088
14:43:30.420365
1089
14:43:31.137458
1090
14:43:31.852568
1091
14:43:32.580622
1092
14:43:33.325642
1093
14:43:34.033772
1094
14:43:34.760828
1095
14:43:35.478909
1096
14:43:36.234405
1097
14:43:37.005357
1098
14:43:37.744392
1099
14:43:38.471450
1100
14:43:39.200511
1101
14:43:39.914113
1102
14:43:40.648666
1103
14:43:41.393704
1104
14:43:42.114776
1105
14:43:42.835377
1106
14:43:43.556449
1107
14:43:44.273048
1108
14:43:45.032540
1109
14:43:45.760100
1110
14:43:46.486675
1111
14:43:47.224209
1112
14:43:47.972725
1113
14:43:48.710752
1114
14:43:49.442312
1115
14:43:50.190312
1116
14:43:50.931344
1117
14:43:51.690315
1118
14:43:52.422377
1119
14:43:53.182346
1120
14:43:53.919879
1121
14:43:54.647458
1122
14:43:55.366044
1123
14:43:56.107084
1124
14:43:57.113427
1125
14:43:57.872398
1126
14:43:58.654308
1127
14:43:59.500567
1128
14:44:00.298434
1129
14:44:01.086842
1130
14:44:01.818912
1131
14:44:02.547963
1132
14:44:03.297979
1133
14:44:04.033052
1134
14:44:04.809997
1135
14:44:05.570492
1136
14:44:06.337946
1137
14:44:07.089955
1138
14:44:07.816014
1139
14:44:08.549055
1140
14:44:09.316026
1141
14:44:10.121384
1142
14:44:10.860924
1143
14:44:11.629375
1144
14:44:12.366919
1145
14:44:13.124893
1146
14:44:13.864915
1147
14:44:14.622910
1148
14:44:15.376400
1149
14:44:16.137881
1150
14:44:16.899362
1151
14:44:17.641379
1152
14:44:18.404854
1153
14:44:19.147388
1154
14:44:19.882434
1155
14:44:20.623968
1156
14:44:21.389920
1157
14:44:22.139926
1158
14:44:22.880945
1159
14:44:23.615989
1160
14:44:24.393435
1161
14:44:25.159892
1162
14:44:25.932836
1163
14:44:26.685832
1164
14:44:27.458776
1165
14:44:28.202316
1166
14:44:28.960290
1167
14:44:29.707304
1168
14:44:30.454318
1169
14:44:31.208317
1170
14:44:31.970799
1171
14:44:32.723303
1172
14:44:33.473310
1173
14:44:34.283156
1174
14:44:35.052122
1175
14:44:35.840016
1176
14:44:36.588535
1177
14:44:37.343527
1178
14:44:38.126444
1179
14:44:38.899390
1180
14:44:39.661866
1181
14:44:40.437335
1182
14:44:41.202290
1183
14:44:41.966763
1184
14:44:42.714764
1185
14:44:43.529102
1186
14:44:44.283601
1187
14:44:45.072011
1188
14:44:45.840974
1189
14:44:46.591986
1190
14:44:47.358453
1191
14:44:48.121929
1192
14:44:48.875913
1193
14:44:49.671810
1194
14:44:50.459703
1195
14:44:51.243684
1196
14:44:52.054516
1197
14:44:52.820983
1198
14:44:53.586459
1199
14:44:54.379846
1200
14:44:55.167763
1201
14:44:55.931240
1202
14:44:56.753054
1203
14:44:57.510545
1204
14:44:58.291470
1205
14:44:59.102303
1206
14:44:59.932096
1207
14:45:00.693575
1208
14:45:01.494459
1209
14:45:02.252949
1210
14:45:03.022891
1211
14:45:03.791352
1212
14:45:04.554825
1213
14:45:05.345235
1214
14:45:06.111207
1215
14:45:06.904592
1216
14:45:07.694481
1217
14:45:08.501838
1218
14:45:09.260315
1219
14:45:10.037262
1220
14:45:10.828650
1221
14:45:11.662946
1222
14:45:12.443859
1223
14:45:13.240234
1224
14:45:14.022144
1225
14:45:14.828000
1226
14:45:15.595464
1227
14:45:16.383359
1228
14:45:17.194210
1229
14:45:17.960683
1230
14:45:18.748577
1231
14:45:19.525017
1232
14:45:20.343349
1233
14:45:21.127768
1234
14:45:21.926632
1235
14:45:22.737478
1236
14:45:23.523891
1237
14:45:24.310788
1238
14:45:25.082252
1239
14:45:25.857691
1240
14:45:26.639108
1241
14:45:27.434004
1242
14:45:28.235887
1243
14:45:29.016826
1244
14:45:29.793762
1245
14:45:30.566716
1246
14:45:31.379567
1247
14:45:32.206863
1248
14:45:33.012215
1249
14:45:33.817075
1250
14:45:34.596016
1251
14:45:35.382433
1252
14:45:36.170841
1253
14:45:36.980698
1254
14:45:37.778071
1255
14:45:38.560495
1256
14:45:39.373842
1257
14:45:40.155270
1258
14:45:40.971101
1259
14:45:41.779939
1260
14:45:42.575821
1261
14:45:43.369699
1262
14:45:44.203575
1263
14:45:45.034857
1264
14:45:45.877111
1265
14:45:46.691943
1266
14:45:47.517757
1267
14:45:48.343069
1268
14:45:49.127993
1269
14:45:49.923393
1270
14:45:50.720262
1271
14:45:51.515654
1272
14:45:52.325503
1273
14:45:53.170765
1274
14:45:53.983604
1275
14:45:54.811420
1276
14:45:55.653674
1277
14:45:56.496935
1278
14:45:57.311772
1279
14:45:58.144080
1280
14:45:58.933980
1281
14:45:59.792685
1282
14:46:00.585072
1283
14:46:01.410885
1284
14:46:02.222236
1285
14:46:03.025110
1286
14:46:03.916727
1287
14:46:04.746025
1288
14:46:05.596750
1289
14:46:06.388161
1290
14:46:07.202506
1291
14:46:08.027311
1292
14:46:08.818710
1293
14:46:09.620587
1294
14:46:10.443388
1295
14:46:11.299112
1296
14:46:12.149837
1297
14:46:12.981144
1298
14:46:13.797961
1299
14:46:14.654194
1300
14:46:15.461059
1301
14:46:16.281865
1302
14:46:17.115171
1303
14:46:17.935976
1304
14:46:18.773748
1305
14:46:19.583593
1306
14:46:20.391959
1307
14:46:21.201794
1308
14:46:22.021613
1309
14:46:22.876328
1310
14:46:23.697155
1311
14:46:24.508999
1312
14:46:25.319338
1313
14:46:26.118716
1314
14:46:26.984423
1315
14:46:27.819248
1316
14:46:28.696421
1317
14:46:29.547657
1318
14:46:30.365983
1319
14:46:31.193770
1320
14:46:32.061964
1321
14:46:32.915207
1322
14:46:33.792881
1323
14:46:34.614191
1324
14:46:35.446977
1325
14:46:36.304194
1326
14:46:37.143970
1327
14:46:37.975755
1328
14:46:38.817514
1329
14:46:39.653290
1330
14:46:40.475606
1331
14:46:41.320358
1332
14:46:42.169592
1333
14:46:43.011352
1334
14:46:43.817209
1335
14:46:44.657982
1336
14:46:45.498260
1337
14:46:46.322070
1338
14:46:47.159850
1339
14:46:48.011574
1340
14:46:48.857323
1341
14:46:49.694101
1342
14:46:50.511451
1343
14:46:51.363185
1344
14:46:52.206949
1345
14:46:53.029749
1346
14:46:53.852560
1347
14:46:54.740199
1348
14:46:55.588446
1349
14:46:56.452651
1350
14:46:57.289415
1351
14:46:58.143637
1352
14:46:58.966940
1353
14:46:59.835124
1354
14:47:00.705820
1355
14:47:01.596439
1356
14:47:02.472614
1357
14:47:03.326356
1358
14:47:04.165125
1359
14:47:04.991925
1360
14:47:05.811754
1361
14:47:06.882891
1362
14:47:07.754572
1363
14:47:08.583860
1364
14:47:09.413153
1365
14:47:10.276855
1366
14:47:11.155023
1367
14:47:12.012248
1368
14:47:12.888915
1369
14:47:13.717708
1370
14:47:14.547003
1371
14:47:15.416688
1372
14:47:16.299329
1373
14:47:17.159544
1374
14:47:17.998312
1375
14:47:18.855022
1376
14:47:19.687808
1377
14:47:20.556003
1378
14:47:21.458599
1379
14:47:22.349723
1380
14:47:23.197961
1381
14:47:24.029757
1382
14:47:24.865552
1383
14:47:25.732246
1384
14:47:26.576001
1385
14:47:27.441699
1386
14:47:28.311374
1387
14:47:29.160117
1388
14:47:30.006371
1389
14:47:30.881548
1390
14:47:31.722301
1391
14:47:32.582044
1392
14:47:33.429296
1393
14:47:34.308461
1394
14:47:35.152721
1395
14:47:36.027383
1396
14:47:36.906549
1397
14:47:37.793682
1398
14:47:38.668344
1399
14:47:39.583414
1400
14:47:40.443630
1401
14:47:41.313305
1402
14:47:42.174015
1403
14:47:43.035722
1404
14:47:43.921377
1405
14:47:44.781089
1406
14:47:45.633820
1407
14:47:46.487547
1408
14:47:47.407594
1409
14:47:48.301230
1410
14:47:49.202336
1411
14:47:50.106930
1412
14:47:50.966149
1413
14:47:51.806902
1414
14:47:52.648159
1415
14:47:53.518850
1416
14:47:54.375560
1417
14:47:55.241759
1418
14:47:56.114427
1419
14:47:56.961173
1420
14:47:57.818902
1421
14:47:58.677607
1422
14:47:59.593169
1423
14:48:00.500767
1424
14:48:01.384405
1425
14:48:02.250113
1426
14:48:03.114802
1427
14:48:04.169994
1428
14:48:05.043669
1429
14:48:05.895402
1430
14:48:06.786032
1431
14:48:07.642259
1432
14:48:08.571291
1433
14:48:09.429995
1434
14:48:10.292214
1435
14:48:11.187821
1436
14:48:12.070964
1437
14:48:12.956116
1438
14:48:13.847741
1439
14:48:14.702476
1440
14:48:15.594093
1441
14:48:16.519639
1442
14:48:17.435202
1443
14:48:18.364233
1444
14:48:19.282296
1445
14:48:20.162977
1446
14:48:21.057617
1447
14:48:21.950735
1448
14:48:22.824419
1449
14:48:23.701075
1450
14:48:24.593714
1451
14:48:25.470874
1452
14:48:26.365996
1453
14:48:27.229687
1454
14:48:28.093388
1455
14:48:28.993992
1456
14:48:29.856686
1457
14:48:30.728869
1458
14:48:31.649940
1459
14:48:32.562552
1460
14:48:33.423270
1461
14:48:34.316900
1462
14:48:35.182104
1463
14:48:36.051294
1464
14:48:36.921977
1465
14:48:37.794151
1466
14:48:38.661831
1467
14:48:39.566442
1468
14:48:40.449082
1469
14:48:41.333233
1470
14:48:42.247788
1471
14:48:43.169336
1472
14:48:44.066956
1473
14:48:44.978568
1474
14:48:45.886155
1475
14:48:46.788761
1476
14:48:47.661932
1477
14:48:48.537601
1478
14:48:49.412776
1479
14:48:50.283952
1480
14:48:51.208490
1481
14:48:52.092642
1482
14:48:53.005202
1483
14:48:53.953687
1484
14:48:54.893188
1485
14:48:55.797290
1486
14:48:56.696886
1487
14:48:57.618432
1488
14:48:58.505575
1489
14:48:59.407199
1490
14:49:00.310794
1491
14:49:01.203420
1492
14:49:02.109514
1493
14:49:03.010610
1494
14:49:03.941636
1495
14:49:04.841735
1496
14:49:05.730359
1497
14:49:06.637944
1498
14:49:07.526578
1499
14:49:08.416716
1500
14:49:09.339259
1501
14:49:10.253814
1502
14:49:11.148927
1503
14:49:12.045036
1504
14:49:12.948140
1505
14:49:13.871682
1506
14:49:14.789240
1507
14:49:15.675881
1508
14:49:16.596429
1509
14:49:17.517486
1510
14:49:18.439038
1511
14:49:19.329686
1512
14:49:20.247264
1513
14:49:21.161335
1514
14:49:22.086386
1515
14:49:22.997476
1516
14:49:23.929984
1517
14:49:24.848055
1518
14:49:25.744659
1519
14:49:26.662217
1520
14:49:27.563327
1521
14:49:28.486377
1522
14:49:29.405919
1523
14:49:30.317986
1524
14:49:31.222086
1525
14:49:32.189529
1526
14:49:33.097102
1527
14:49:34.020653
1528
14:49:34.941193
1529
14:49:35.896659
1530
14:49:36.805242
1531
14:49:37.737265
1532
14:49:38.660323
1533
14:49:39.587860
1534
Rate limit reached. Sleeping for: 248
14:49:40.540325
1535
14:53:54.487509
1536
14:53:55.409562
1537
14:53:56.340075
1538
14:53:57.260637
1539
14:53:58.168224
1540
14:53:59.100731
1541
14:54:00.020778
1542
14:54:00.957285
1543
14:54:01.889815
1544
14:54:02.804872
1545
14:54:03.724939
1546
14:54:04.671409
1547
14:54:05.602440
1548
14:54:06.506045
1549
14:54:07.453031
1550
14:54:08.385055
1551
14:54:09.349007
1552
14:54:10.306448
1553
14:54:11.275868
1554
14:54:12.244290
1555
14:54:13.172828
1556
14:54:14.102858
1557
14:54:15.042347
1558
14:54:16.045675
1559
14:54:17.036039
1560
14:54:18.018426
1561
14:54:19.081598
1562
14:54:20.029581
1563
14:54:21.019933
1564
14:54:21.986881
1565
14:54:23.001674
1566
14:54:23.957633
1567
14:54:24.965444
1568
14:54:25.958803
1569
14:54:27.170084
1570
14:54:28.096618
1571
14:54:29.043591
1572
14:54:30.019006
1573
14:54:31.065730
1574
14:54:31.993259
1575
14:54:32.958700
1576
14:54:33.874253
1577
14:54:34.794813
1578
14:54:35.795139
1579
14:54:36.803466
1580
14:54:37.876612
1581
14:54:38.805634
1582
14:54:39.740160
1583
14:54:40.693611
1584
14:54:41.636105
1585
14:54:42.687305
1586
14:54:43.627806
1587
14:54:44.588265
1588
14:54:45.524772
1589
14:54:46.467263
1590
14:54:47.433198
1591
14:54:48.373694
1592
14:54:49.303210
1593
14:54:50.266671
1594
14:54:51.203692
1595
14:54:52.147686
1596
14:54:53.091164
1597
14:54:54.060583
1598
14:54:55.024018
1599
14:54:56.105642
1600
14:54:57.070568
1601
14:54:58.388069
1602
14:54:59.337542
1603
14:55:00.311443
1604
14:55:01.271418
1605
14:55:02.224893
1606
14:55:03.169368
1607
14:55:04.124836
1608
14:55:05.084786
1609
14:55:06.066172
1610
14:55:07.082982
1611
14:55:08.055886
1612
14:55:09.011344
1613
14:55:09.964813
1614
14:55:10.932228
1615
14:55:11.893657
1616
14:55:12.838132
1617
14:55:13.766650
1618
14:55:14.701152
1619
14:55:15.663579
1620
14:55:16.634982
1621
14:55:17.597410
1622
14:55:18.569810
1623
14:55:19.549192
1624
14:55:20.514612
1625
14:55:21.480031
1626
14:55:22.462405
1627
14:55:23.410869
1628
14:55:24.353350
1629
14:55:25.304807
1630
14:55:26.277207
1631
14:55:27.216695
1632
14:55:28.230984
1633
14:55:29.169475
1634
14:55:30.114948
1635
14:55:31.064410
1636
14:55:32.001904
1637
14:55:32.948374
1638
14:55:33.903820
1639
14:55:34.871233
1640
14:55:35.816706
1641
14:55:36.811048
1642
14:55:37.759512
1643
14:55:38.706980
1644
14:55:39.723263
1645
14:55:40.685690
1646
14:55:41.658091
1647
14:55:42.618523
1648
14:55:43.573969
1649
14:55:44.532407
1650
14:55:45.480872
1651
14:55:46.434323
1652
14:55:47.378799
1653
14:55:48.355189
1654
14:55:49.313626
1655
14:55:50.282038
1656
14:55:51.278374
1657
14:55:52.223847
1658
14:55:53.168322
1659
14:55:54.129752
1660
14:55:55.090184
1661
14:55:56.038649
1662
14:55:57.002074
1663
14:55:57.996416
1664
14:55:58.982779
1665
14:55:59.934236
1666
14:56:00.899655
1667
14:56:01.867069
1668
14:56:02.883352
1669
14:56:04.079155
1670
14:56:05.093444
1671
14:56:06.092773
1672
14:56:07.067189
1673
14:56:08.036598
1674
14:56:09.012508
1675
14:56:09.994402
1676
14:56:10.993247
1677
14:56:11.975143
1678
14:56:12.995416
1679
14:56:13.996753
1680
14:56:15.012050
1681
14:56:16.036814
1682
14:56:17.047134
1683
14:56:18.076406
1684
14:56:19.092698
1685
14:56:20.100015
1686
14:56:21.155710
1687
14:56:22.222868
1688
14:56:23.215729
1689
14:56:24.201095
1690
14:56:25.193957
1691
14:56:26.241179
1692
14:56:27.308840
1693
14:56:28.305679
1694
14:56:29.295549
1695
14:56:30.271446
1696
14:56:31.302711
1697
14:56:32.327477
1698
14:56:33.328823
1699
14:56:34.342115
1700
14:56:35.348929
1701
14:56:36.395143
1702
14:56:37.433393
1703
14:56:38.435725
1704
14:56:39.408640
1705
14:56:40.386531
1706
14:56:41.494595
1707
14:56:42.520864
1708
14:56:43.526178
1709
14:56:44.522031
1710
14:56:45.540319
1711
14:56:46.549633
1712
14:56:47.539502
1713
14:56:48.579745
1714
14:56:49.624951
1715
14:56:50.651727
1716
14:56:51.679495
1717
14:56:52.695307
1718
14:56:53.687654
1719
14:56:54.671036
1720
14:56:55.676865
1721
14:56:56.705631
1722
14:56:57.705473
1723
14:56:58.688844
1724
14:56:59.664256
1725
14:57:00.677547
1726
14:57:01.778120
1727
14:57:02.796914
1728
14:57:03.853102
1729
14:57:04.852441
1730
14:57:05.866749
1731
14:57:06.878560
1732
14:57:07.878897
1733
14:57:08.878729
1734
14:57:09.880566
1735
14:57:11.029025
1736
14:57:12.055281
1737
14:57:13.108991
1738
14:57:14.136254
1739
14:57:15.151549
1740
14:57:16.164361
1741
14:57:17.203606
1742
14:57:18.214902
1743
14:57:19.248152
1744
14:57:20.267951
1745
14:57:21.318650
1746
14:57:22.392779
1747
14:57:23.391624
1748
14:57:24.393945
1749
14:57:25.425703
1750
14:57:26.520798
1751
14:57:27.576481
1752
14:57:28.605751
1753
14:57:29.641500
1754
14:57:30.697690
1755
14:57:31.749384
1756
14:57:32.786134
1757
14:57:33.881216
1758
14:57:34.928931
1759
14:57:36.329187
1760
14:57:37.354951
1761
14:57:38.369764
1762
14:57:39.375077
1763
14:57:40.392872
1764
14:57:41.426624
1765
14:57:42.474836
1766
14:57:43.511087
1767
14:57:44.532860
1768
14:57:45.578584
1769
14:57:46.632282
1770
14:57:47.677512
1771
14:57:48.707767
1772
14:57:49.737536
1773
14:57:50.777273
1774
14:57:51.831969
1775
14:57:52.866720
1776
14:57:53.878531
1777
14:57:54.911770
1778
14:57:55.949510
1779
14:57:57.070534
1780
14:57:58.156153
1781
14:57:59.190903
1782
14:58:00.228650
1783
14:58:01.298805
1784
14:58:02.330564
1785
14:58:03.397733
1786
14:58:04.429986
1787
14:58:05.446783
1788
14:58:06.516933
1789
14:58:07.549698
1790
14:58:08.589917
1791
14:58:09.663563
1792
14:58:10.734216
1793
14:58:11.896121
1794
14:58:12.948824
1795
14:58:14.034931
1796
14:58:15.090121
1797
14:58:16.201680
1798
14:58:17.246410
1799
14:58:18.289139
1800
14:58:19.334356
1801
14:58:20.376580
1802
14:58:22.460058
1803
14:58:23.487822
1804
14:58:24.536041
1805
14:58:25.595724
1806
14:58:26.657885
1807
14:58:27.770428
1808
14:58:28.820621
1809
14:58:29.937155
1810
14:58:31.001825
1811
14:58:32.060041
1812
14:58:33.125697
1813
14:58:34.199864
1814
14:58:35.236094
1815
14:58:36.280816
1816
14:58:37.341496
1817
14:58:38.396200
1818
14:58:39.448398
1819
14:58:40.528544
1820
14:58:41.680968
1821
14:58:42.753114
1822
14:58:43.831242
1823
14:58:44.927837
1824
14:58:46.057322
1825
14:58:47.181327
1826
14:58:48.245510
1827
14:58:49.325634
1828
14:58:50.398282
1829
14:58:51.460443
1830
14:58:52.549544
1831
14:58:53.625184
1832
14:58:54.685350
1833
14:58:55.734556
1834
14:58:56.789757
1835
14:58:57.836474
1836
14:58:58.865745
1837
14:58:59.901491
1838
14:59:00.969656
1839
14:59:02.010379
1840
14:59:03.079061
1841
14:59:04.144213
1842
14:59:05.217860
1843
14:59:06.308450
1844
14:59:07.369623
1845
14:59:08.422830
1846
14:59:09.486501
1847
14:59:10.536715
1848
14:59:11.692635
1849
14:59:12.760298
1850
14:59:13.801515
1851
14:59:14.840264
1852
14:59:15.973236
1853
14:59:17.034914
1854
14:59:18.077654
1855
14:59:19.158763
1856
14:59:20.220942
1857
14:59:21.273151
1858
14:59:22.355270
1859
14:59:23.424420
1860
14:59:24.496063
1861
14:59:25.553752
1862
14:59:26.636876
1863
14:59:27.701547
1864
14:59:28.738775
1865
14:59:29.793977
1866
14:59:30.892043
1867
14:59:31.980653
1868
14:59:33.063264
1869
14:59:34.121446
1870
14:59:35.196089
1871
14:59:36.567941
1872
14:59:37.631108
1873
14:59:38.718213
1874
14:59:39.793349
1875
14:59:40.896422
1876
14:59:41.961639
1877
14:59:43.058234
1878
14:59:44.143392
1879
14:59:45.212546
1880
14:59:46.299147
1881
14:59:47.396225
1882
14:59:48.470378
1883
14:59:49.548511
1884
14:59:50.608196
1885
14:59:51.705788
1886
14:59:52.799863
1887
14:59:53.876006
1888
14:59:54.954639
1889
14:59:56.075652
1890
14:59:57.173260
1891
14:59:58.261374
1892
14:59:59.348494
1893
15:00:00.478499
1894
15:00:01.600511
1895
15:00:02.676154
1896
15:00:03.740832
1897
15:00:04.793523
1898
15:00:05.870666
1899
15:00:06.953277
1900
15:00:08.014450
1901
15:00:09.132482
1902
15:00:10.187662
1903
15:00:11.538079
1904
15:00:12.628680
1905
15:00:13.700331
1906
15:00:14.817849
1907
15:00:15.927892
1908
15:00:16.999541
1909
15:00:18.086152
1910
15:00:19.160292
1911
15:00:20.244404
1912
15:00:21.423278
1913
15:00:22.614120
1914
15:00:23.719166
1915
15:00:24.808769
1916
15:00:26.244458
1917
15:00:27.311615
1918
15:00:28.379279
1919
15:00:29.479348
1920
15:00:30.556492
1921
15:00:31.666525
1922
15:00:32.758616
1923
15:00:33.845226
1924
15:00:34.937823
1925
15:00:36.041872
1926
15:00:37.141447
1927
15:00:38.224068
1928
15:00:39.319172
1929
15:00:40.421283
1930
15:00:41.521856
1931
15:00:42.588017
1932
15:00:43.701559
1933
15:00:44.791174
1934
15:00:45.874279
1935
15:00:46.995294
1936
15:00:48.095373
1937
15:00:49.217374
1938
15:00:50.286516
1939
15:00:51.434477
1940
15:00:52.538557
1941
15:00:53.621167
1942
15:00:54.696828
1943
15:00:55.788974
1944
15:00:56.882052
1945
15:00:57.969660
1946
15:00:59.065729
1947
15:01:00.178287
1948
15:01:01.292823
1949
15:01:02.487146
1950
15:01:03.560792
1951
15:01:04.634921
1952
15:01:05.741479
1953
15:01:06.844542
1954
15:01:07.925675
1955
15:01:09.048693
1956
15:01:10.163235
1957
15:01:11.284272
1958
15:01:12.380860
1959
15:01:13.498882
1960
15:01:14.607435
1961
15:01:15.701535
1962
15:01:16.826536
1963
15:01:17.928121
1964
15:01:19.053156
1965
15:01:20.149696
1966
15:01:21.309595
1967
15:01:22.461033
1968
15:01:23.571581
1969
15:01:24.705075
1970
15:01:25.848040
1971
15:01:26.974533
1972
15:01:28.066646
1973
15:01:29.168712
1974
15:01:30.262788
1975
15:01:31.393775
1976
15:01:32.513299
1977
15:01:33.658264
1978
15:01:34.775278
1979
15:01:35.870381
1980
15:01:37.024814
1981
15:01:38.134363
1982
15:01:39.299755
1983
15:01:40.461166
1984
15:01:42.601979
1985
15:01:43.722499
1986
15:01:44.831050
1987
15:01:45.940599
1988
15:01:47.082075
1989
15:01:48.215551
1990
15:01:49.322592
1991
15:01:50.427661
1992
15:01:51.561651
1993
15:01:52.683742
1994
15:01:53.822204
1995
15:01:54.916300
1996
15:01:56.051793
1997
15:01:57.186778
1998
15:01:58.304309
1999
15:01:59.418341
2000
15:02:00.530893
2001
15:02:01.655393
2002
15:02:02.887141
2003
15:02:04.007168
2004
15:02:05.129180
2005
15:02:06.269144
2006
15:02:07.473434
2007
15:02:08.585491
2008
15:02:09.730430
2009
15:02:10.828505
2010
15:02:11.953519
2011
15:02:13.092475
2012
15:02:14.295293
2013
15:02:15.413818
2014
15:02:16.542811
2015
15:02:17.696749
2016
15:02:18.828240
2017
15:02:19.943269
2018
15:02:21.080230
2019
15:02:22.250618
2020
15:02:23.376619
2021
15:02:24.522076
2022
15:02:25.666608
2023
15:02:26.821521
2024
15:02:27.961486
2025
15:02:29.070532
2026
15:02:30.185059
2027
15:02:31.337530
2028
15:02:32.583210
2029
15:02:33.699227
2030
15:02:34.825732
2031
15:02:35.953727
2032
15:02:37.116139
2033
15:02:38.234667
2034
15:02:39.350198
2035
15:02:40.491172
2036
15:02:41.678505
2037
15:02:42.838418
2038
15:02:43.961430
2039
15:02:45.137323
2040
15:02:46.380012
2041
15:02:47.526453
2042
15:02:48.642482
2043
15:02:49.758520
2044
15:02:50.885517
2045
15:02:52.042929
2046
15:02:53.197361
2047
15:02:54.324359
2048
15:02:55.478294
2049
15:02:56.630728
2050
15:02:57.811096
2051
15:02:58.984475
2052
15:03:00.137413
2053
15:03:01.285850
2054
15:03:02.449246
2055
15:03:03.593197
2056
15:03:04.770063
2057
15:03:05.928484
2058
15:03:07.307807
2059
15:03:08.449755
2060
15:03:09.622138
2061
15:03:10.777073
2062
15:03:11.943461
2063
15:03:13.147264
2064
15:03:14.307198
2065
15:03:15.436180
2066
15:03:16.610557
2067
15:03:17.747528
2068
15:03:18.887986
2069
15:03:20.056377
2070
15:03:21.258678
2071
15:03:22.471448
2072
15:03:23.624884
2073
15:03:24.838659
2074
15:03:25.988102
2075
15:03:27.155980
2076
15:03:28.310423
2077
15:03:29.444919
2078
15:03:30.606824
2079
15:03:31.780202
2080
15:03:32.925646
2081
15:03:34.094546
2082
15:03:35.225533
2083
15:03:36.428339
2084
15:03:37.610180
2085
15:03:38.774583
2086
15:03:39.947470
2087
15:03:41.093921
2088
15:03:42.593447
2089
15:03:43.748877
2090
15:03:44.903789
2091
15:03:46.116065
2092
15:03:47.382197
2093
15:03:48.547101
2094
15:03:49.695536
2095
15:03:50.868423
2096
15:03:52.037802
2097
15:03:53.179273
2098
15:03:54.380063
2099
15:03:55.516550
2100
15:03:56.664493
2101
15:03:57.858806
2102
15:03:59.035671
2103
15:04:00.191615
2104
15:04:01.345540
2105
15:04:02.562805
2106
15:04:03.774577
2107
15:04:04.947441
2108
15:04:06.140786
2109
15:04:07.322649
2110
15:04:08.496029
2111
15:04:09.678383
2112
15:04:10.851843
2113
15:04:12.037693
2114
15:04:13.247965
2115
15:04:14.452756
2116
15:04:15.672523
2117
15:04:17.003492
2118
15:04:18.192818
2119
15:04:19.391139
2120
15:04:20.589946
2121
15:04:21.760838
2122
15:04:22.926240
2123
15:04:24.109088
2124
15:04:25.287454
2125
15:04:26.466327
2126
15:04:27.657657
2127
15:04:28.833524
2128
15:04:30.012899
2129
15:04:31.203740
2130
15:04:32.390577
2131
15:04:33.587412
2132
15:04:34.778252
2133
15:04:36.005488
2134
15:04:37.179371
2135
15:04:38.345277
2136
15:04:39.500202
2137
15:04:40.675083
2138
15:04:41.891844
2139
15:04:43.078682
2140
15:04:44.242096
2141
15:04:45.400514
2142
15:04:46.617791
2143
15:04:47.899379
2144
15:04:49.167011
2145
15:04:50.357828
2146
15:04:51.907222
2147
15:04:53.092074
2148
15:04:54.295372
2149
15:04:55.456291
2150
15:04:56.631171
2151
15:04:57.837946
2152
15:04:59.013330
2153
15:05:00.207150
2154
15:05:01.402964
2155
15:05:02.706993
2156
15:05:03.970625
2157
15:05:05.165440
2158
15:05:06.407121
2159
15:05:07.609931
2160
15:05:08.817734
2161
15:05:10.006556
2162
15:05:11.268687
2163
15:05:12.514368
2164
15:05:13.747072
2165
15:05:14.923939
2166
15:05:16.097822
2167
15:05:17.354498
2168
15:05:18.559338
2169
15:05:19.732716
2170
15:05:20.949994
2171
15:05:22.155791
2172
15:05:23.375550
2173
15:05:24.569889
2174
15:05:25.753251
2175
15:05:26.998932
2176
15:05:28.216231
2177
15:05:29.412553
2178
15:05:30.618847
2179
15:05:31.926374
2180
15:05:33.130668
2181
15:05:34.319994
2182
15:05:35.499346
2183
15:05:36.746035
2184
15:05:37.958310
2185
15:05:39.186046
2186
15:05:40.389840
2187
15:05:41.743330
2188
15:05:42.934166
2189
15:05:44.149933
2190
15:05:45.358732
2191
15:05:46.621357
2192
15:05:47.934373
2193
15:05:49.167587
2194
15:05:50.364904
2195
15:05:51.651490
2196
15:05:52.867258
2197
15:05:54.082026
2198
15:05:55.289833
2199
15:05:56.474183
2200
15:05:57.676484
2201
15:05:58.956568
2202
15:06:00.140918
2203
15:06:01.373139
2204
15:06:02.593402
2205
15:06:03.806180
2206
15:06:05.012465
2207
15:06:06.254177
2208
15:06:07.460457
2209
15:06:08.675220
2210
15:06:09.917408
2211
15:06:11.133185
2212
15:06:12.357921
2213
15:06:13.574691
2214
15:06:14.796948
2215
15:06:16.005244
2216
15:06:17.362133
2217
15:06:18.565914
2218
15:06:19.790661
2219
15:06:20.995441
2220
15:06:22.250601
2221
15:06:23.476839
2222
15:06:24.667174
2223
15:06:25.852519
2224
15:06:27.054326
2225
15:06:28.271580
2226
15:06:29.519254
2227
15:06:30.745517
2228
15:06:31.982783
2229
15:06:33.206542
2230
15:06:34.439247
2231
15:06:35.662986
2232
15:06:36.918150
2233
15:06:38.150886
2234
15:06:39.376693
2235
15:06:40.612404
2236
15:06:41.894494
2237
15:06:43.090306
2238
15:06:44.320029
2239
15:06:45.534298
2240
15:06:46.749566
2241
15:06:47.993760
2242
15:06:49.232469
2243
15:06:50.457699
2244
15:06:51.682949
2245
15:06:52.940587
2246
15:06:54.162347
2247
15:06:55.391577
2248
15:06:56.643737
2249
15:06:57.896414
2250
15:06:59.160550
2251
15:07:00.374314
2252
15:07:01.628971
2253
15:07:02.904081
2254
15:07:04.150770
2255
15:07:05.406932
2256
15:07:06.638654
2257
15:07:07.912276
2258
15:07:09.161471
2259
15:07:10.407162
2260
15:07:11.656328
2261
15:07:12.904004
2262
15:07:14.135237
2263
15:07:15.385441
2264
15:07:16.635100
2265
15:07:17.889266
2266
15:07:19.127966
2267
15:07:20.352209
2268
15:07:21.578951
2269
15:07:22.863542
2270
15:07:24.116708
2271
15:07:25.360900
2272
15:07:26.595104
2273
15:07:27.875705
2274
15:07:29.084499
2275
15:07:30.301257
2276
15:07:31.602799
2277
15:07:32.824049
2278
15:07:34.042813
2279
15:07:35.319422
2280
15:07:36.587033
2281
15:07:37.851172
2282
15:07:39.154700
2283
15:07:40.400410
2284
15:07:41.645587
2285
15:07:42.980029
2286
15:07:44.218729
2287
15:07:45.442971
2288
15:07:46.705618
2289
15:07:47.986709
2290
15:07:49.239360
2291
15:07:50.499005
2292
15:07:52.125701
2293
15:07:53.411768
2294
15:07:54.656466
2295
15:07:55.925097
2296
15:07:57.261046
2297
15:07:58.541139
2298
15:07:59.776871
2299
15:08:01.072933
2300
15:08:02.365981
2301
15:08:03.602686
2302
15:08:04.843380
2303
15:08:06.094551
2304
15:08:07.357187
2305
15:08:08.600862
2306
15:08:09.878457
2307
15:08:11.128645
2308
15:08:12.385825
2309
15:08:13.626542
2310
15:08:14.876714
2311
15:08:16.129365
2312
15:08:17.574022
2313
15:08:18.854116
2314
15:08:20.101781
2315
15:08:21.372911
2316
15:08:22.669458
2317
15:08:23.942056
2318
15:08:25.199212
2319
15:08:26.556605
2320
15:08:27.909503
2321
15:08:29.152192
2322
15:08:30.406365
2323
15:08:31.699414
2324
15:08:32.996476
2325
15:08:34.241189
2326
15:08:35.494344
2327
15:08:36.866700
2328
15:08:38.117365
2329
15:08:39.367034
2330
15:08:40.654618
2331
15:08:42.009010
2332
15:08:43.362406
2333
15:08:44.613578
2334
15:08:45.889181
2335
15:08:47.242100
2336
15:08:48.493754
2337
15:08:49.735467
2338
15:08:51.015046
2339
15:08:52.300127
2340
15:08:53.583695
2341
15:08:54.852318
2342
15:08:56.170793
2343
15:08:57.514215
2344
15:08:58.847182
2345
15:09:00.147223

In [ ]:

data = {}
data['tweets'] = []
tweet_errors = {}
tweet_count = 1
for tweet_id in tweet_ids:
    try:
        # Print id counter
        print(tweet_count)
        # Collect tweet info
        tweet = api.get_status(tweet_id, tweet_mode='extended')
        info = tweet._json
        # Collect specific data
        retweet_count = info['retweet_count']
        favorite_count = info['favorite_count']
        followers_count = info['user']['followers_count']
        # Append to data dict
        data['tweets'].append({
            'tweet_id': tweet_id, 
            'retweet_count': retweet_count, 
            'favorite_count': favorite_count,
            'followers_count': followers_count
        })
        #print(retweet_count, favorite_count, followers_count) # debug test
        #print(data)
        #break # debug test
        # Print timer info to estimate time until wake-up
        print(datetime.datetime.now().time())
        # Add one to the tweet count for further printing
        tweet_count += 1
        
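    except Exception as e:
        # Print exception info and record the error message in tweet_errors.
        # `info` is not usable here: when get_status() fails it still holds the
        # previous tweet's JSON (or is undefined if the very first request fails).
        print(str(tweet_count) + "_" + str(tweet_id) + ": " + str(e))
        tweet_errors[str(tweet_count) + "_" + str(tweet_id)] = str(e)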

In [5]:

# Extract data from file
df_list = []
with open('tweet_json.txt') as json_file:
    data = json.load(json_file)
    for tweet in data:
        df_list.append({'tweet_id': tweet['id'],
                        'retweet_count': tweet['retweet_count'], 
                        'favorite_count': tweet['favorite_count'],
                        'followers_count': tweet['user']['followers_count']})
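The cell above reads tweet_json.txt as a single JSON array of tweet objects, each with `id`, `retweet_count`, `favorite_count` and a nested `user` dict. The notebook does not show the step that wrote this file, so the following is only a minimal sketch of one way such a file could be written and read back. The helper names `save_tweets` and `load_counts`, the sample records and the temporary file path are all assumptions made for illustration; in the real run, the list would hold the `tweet._json` dictionaries returned by the API.

```python
import json
import os
import tempfile


def save_tweets(raw_tweets, path):
    """Write a list of tweet dicts to `path` as one JSON array (hypothetical helper)."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(raw_tweets, f)


def load_counts(path):
    """Read the JSON array back and keep only the fields used in this notebook."""
    with open(path, encoding='utf-8') as f:
        tweets = json.load(f)
    return [{'tweet_id': t['id'],
             'retweet_count': t['retweet_count'],
             'favorite_count': t['favorite_count'],
             'followers_count': t['user']['followers_count']}
            for t in tweets]


# Made-up records standing in for tweet._json payloads
sample_tweets = [
    {'id': 1, 'retweet_count': 10, 'favorite_count': 25,
     'user': {'followers_count': 100}},
    {'id': 2, 'retweet_count': 3, 'favorite_count': 7,
     'user': {'followers_count': 100}},
]

# Write to a throwaway location so the real tweet_json.txt is not touched
sample_path = os.path.join(tempfile.mkdtemp(), 'tweet_json_sample.txt')
save_tweets(sample_tweets, sample_path)
rows = load_counts(sample_path)
print(rows[0])
```

Storing the full tweet objects rather than only the three counts keeps the option of extracting further fields later without querying the API again.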

In [6]:

# Create DataFrame from list of dictionaries
api_data = pd.DataFrame(df_list, columns = ['tweet_id', 
                                            'retweet_count', 
                                            'favorite_count', 
                                            'followers_count'])

In [32]:

tweet_errors.keys()

Out[32]:

dict_keys(['19_888202515573088257', '94_873697596434513921', '116_869988702071779329', '129_866816280283807744', '151_861769973181624320', '242_845459076796616705', '254_842892208864923648', '291_837012587749474308', '374_827228250799742977', '557_802247111496568832', '774_775096608509886464'])
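
Each key above has the form `<counter>_<tweet_id>`, so the IDs of the tweets that could not be fetched can be recovered by splitting on the first underscore. The sketch below does this for the eleven keys copied from the output; the function name `failed_tweet_ids` is an assumption introduced here and is not part of the original notebook.

```python
# Keys copied from the tweet_errors output above
error_keys = [
    '19_888202515573088257', '94_873697596434513921',
    '116_869988702071779329', '129_866816280283807744',
    '151_861769973181624320', '242_845459076796616705',
    '254_842892208864923648', '291_837012587749474308',
    '374_827228250799742977', '557_802247111496568832',
    '774_775096608509886464',
]


def failed_tweet_ids(keys):
    """Return the tweet ID part of each '<counter>_<tweet_id>' key as an int."""
    return [int(key.split('_', 1)[1]) for key in keys]


failed_ids = failed_tweet_ids(error_keys)
print(len(failed_ids))
```

Assuming `tweet_id` is stored as an integer column in `archive`, an expression such as `archive[archive.tweet_id.isin(failed_ids)]` would then list the affected rows for inspection.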

Resources:

Assess the Data

archive table

In [68]:

archive

Out[68]:

 | tweet_id | in_reply_to_status_id | in_reply_to_user_id | timestamp | source | text | retweeted_status_id | retweeted_status_user_id | retweeted_status_timestamp | expanded_urls | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo
0 | 892420643555336193 | NaN | NaN | 2017-08-01 16:23:56 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Phineas. He’s a mystical boy. Only eve… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/892420643… | 13 | 10 | Phineas | None | None | None | None
1 | 892177421306343426 | NaN | NaN | 2017-08-01 00:17:27 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Tilly. She’s just checking pup on you…. | NaN | NaN | NaN | https://twitter.com/dog_rates/status/892177421… | 13 | 10 | Tilly | None | None | None | None
2 | 891815181378084864 | NaN | NaN | 2017-07-31 00:18:03 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Archie. He is a rare Norwegian Pouncin… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891815181… | 12 | 10 | Archie | None | None | None | None
3 | 891689557279858688 | NaN | NaN | 2017-07-30 15:58:51 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Darla. She commenced a snooze mid meal… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891689557… | 13 | 10 | Darla | None | None | None | None
4 | 891327558926688256 | NaN | NaN | 2017-07-29 16:00:24 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Franklin. He would like you to stop ca… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891327558… | 12 | 10 | Franklin | None | None | None | None
5 | 891087950875897856 | NaN | NaN | 2017-07-29 00:08:17 +0000 | <a href="http://twitter.com/download/iphone" r… | Here we have a majestic great white breaching … | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891087950… | 13 | 10 | None | None | None | None | None
6 | 890971913173991426 | NaN | NaN | 2017-07-28 16:27:12 +0000 | <a href="http://twitter.com/download/iphone" r… | Meet Jax. He enjoys ice cream so much he gets … | NaN | NaN | NaN | https://gofundme.com/ydvmve-surgery-for-jax,ht… | 13 | 10 | Jax | None | None | None | None
7 | 890729181411237888 | NaN | NaN | 2017-07-28 00:22:40 +0000 | <a href="http://twitter.com/download/iphone" r… | When you watch your owner call another dog a g… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890729181… | 13 | 10 | None | None | None | None | None
8 | 890609185150312448 | NaN | NaN | 2017-07-27 16:25:51 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Zoey. She doesn’t want to be one of th… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890609185… | 13 | 10 | Zoey | None | None | None | None
9 | 890240255349198849 | NaN | NaN | 2017-07-26 15:59:51 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Cassie. She is a college pup. Studying… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890240255… | 14 | 10 | Cassie | doggo | None | None | None
10 | 890006608113172480 | NaN | NaN | 2017-07-26 00:31:25 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Koda. He is a South Australian decksha… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/890006608… | 13 | 10 | Koda | None | None | None | None
11 | 889880896479866881 | NaN | NaN | 2017-07-25 16:11:53 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Bruno. He is a service shark. Only get… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889880896… | 13 | 10 | Bruno | None | None | None | None
12 | 889665388333682689 | NaN | NaN | 2017-07-25 01:55:32 +0000 | <a href="http://twitter.com/download/iphone" r… | Here’s a puppo that seems to be on the fence a… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889665388… | 13 | 10 | None | None | None | None | puppo
13 | 889638837579907072 | NaN | NaN | 2017-07-25 00:10:02 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Ted. He does his best. Sometimes that’… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889638837… | 12 | 10 | Ted | None | None | None | None
14 | 889531135344209921 | NaN | NaN | 2017-07-24 17:02:04 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Stuart. He’s sporting his favorite fan… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889531135… | 13 | 10 | Stuart | None | None | None | puppo
15 | 889278841981685760 | NaN | NaN | 2017-07-24 00:19:32 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Oliver. You’re witnessing one of his m… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/889278841… | 13 | 10 | Oliver | None | None | None | None
16 | 888917238123831296 | NaN | NaN | 2017-07-23 00:22:39 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Jim. He found a fren. Taught him how t… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888917238… | 12 | 10 | Jim | None | None | None | None
17 | 888804989199671297 | NaN | NaN | 2017-07-22 16:56:37 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Zeke. He has a new stick. Very proud o… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888804989… | 13 | 10 | Zeke | None | None | None | None
18 | 888554962724278272 | NaN | NaN | 2017-07-22 00:23:06 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Ralphus. He’s powering up. Attempting … | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888554962… | 13 | 10 | Ralphus | None | None | None | None
19 | 888202515573088257 | NaN | NaN | 2017-07-21 01:02:36 +0000 | <a href="http://twitter.com/download/iphone" r… | RT @dog_rates: This is Canela. She attempted s… | 8.874740e+17 | 4.196984e+09 | 2017-07-19 00:47:34 +0000 | https://twitter.com/dog_rates/status/887473957… | 13 | 10 | Canela | None | None | None | None
20 | 888078434458587136 | NaN | NaN | 2017-07-20 16:49:33 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Gerald. He was just told he didn’t get… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/888078434… | 12 | 10 | Gerald | None | None | None | None
21 | 887705289381826560 | NaN | NaN | 2017-07-19 16:06:48 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Jeffrey. He has a monopoly on the pool… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887705289… | 13 | 10 | Jeffrey | None | None | None | None
22 | 887517139158093824 | NaN | NaN | 2017-07-19 03:39:09 +0000 | <a href="http://twitter.com/download/iphone" r… | I’ve yet to rate a Venezuelan Hover Wiener. Th… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887517139… | 14 | 10 | such | None | None | None | None
23 | 887473957103951883 | NaN | NaN | 2017-07-19 00:47:34 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Canela. She attempted some fancy porch… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887473957… | 13 | 10 | Canela | None | None | None | None
24 | 887343217045368832 | NaN | NaN | 2017-07-18 16:08:03 +0000 | <a href="http://twitter.com/download/iphone" r… | You may not have known you needed to see this … | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887343217… | 13 | 10 | None | None | None | None | None
25 | 887101392804085760 | NaN | NaN | 2017-07-18 00:07:08 +0000 | <a href="http://twitter.com/download/iphone" r… | This… is a Jubilant Antarctic House Bear. We… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/887101392… | 12 | 10 | None | None | None | None | None
26 | 886983233522544640 | NaN | NaN | 2017-07-17 16:17:36 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Maya. She’s very shy. Rarely leaves he… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/886983233… | 13 | 10 | Maya | None | None | None | None
27 | 886736880519319552 | NaN | NaN | 2017-07-16 23:58:41 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Mingus. He’s a wonderful father to his… | NaN | NaN | NaN | https://www.gofundme.com/mingusneedsus,https:/… | 13 | 10 | Mingus | None | None | None | None
28 | 886680336477933568 | NaN | NaN | 2017-07-16 20:14:00 +0000 | <a href="http://twitter.com/download/iphone" r… | This is Derek. He’s late for a dog meeting. 13… | NaN | NaN | NaN | https://twitter.com/dog_rates/status/886680336… | 13 | 10 | Derek | None | None | None | None
29886366144734445568NaNNaN2017-07-15 23:25:31 +0000<a href=”http://twitter.com/download/iphone” r…This is Roscoe. Another pupper fallen victim t…NaNNaNNaNhttps://twitter.com/dog_rates/status/886366144…1210RoscoeNoneNonepupperNone
2326666411507551481857NaNNaN2015-11-17 00:24:19 +0000<a href=”http://twitter.com/download/iphone” r…This is quite the dog. Gets really excited whe…NaNNaNNaNhttps://twitter.com/dog_rates/status/666411507…210quiteNoneNoneNoneNone
2327666407126856765440NaNNaN2015-11-17 00:06:54 +0000<a href=”http://twitter.com/download/iphone” r…This is a southern Vesuvius bumblegruff. Can d…NaNNaNNaNhttps://twitter.com/dog_rates/status/666407126…710aNoneNoneNoneNone
2328666396247373291520NaNNaN2015-11-16 23:23:41 +0000<a href=”http://twitter.com/download/iphone” r…Oh goodness. A super rare northeast Qdoba kang…NaNNaNNaNhttps://twitter.com/dog_rates/status/666396247…910NoneNoneNoneNoneNone
2329666373753744588802NaNNaN2015-11-16 21:54:18 +0000<a href=”http://twitter.com/download/iphone” r…Those are sunglasses and a jean jacket. 11/10 …NaNNaNNaNhttps://twitter.com/dog_rates/status/666373753…1110NoneNoneNoneNoneNone
2330666362758909284353NaNNaN2015-11-16 21:10:36 +0000<a href=”http://twitter.com/download/iphone” r…Unique dog here. Very small. Lives in containe…NaNNaNNaNhttps://twitter.com/dog_rates/status/666362758…610NoneNoneNoneNoneNone
2331666353288456101888NaNNaN2015-11-16 20:32:58 +0000<a href=”http://twitter.com/download/iphone” r…Here we have a mixed Asiago from the Galápagos…NaNNaNNaNhttps://twitter.com/dog_rates/status/666353288…810NoneNoneNoneNoneNone
2332666345417576210432NaNNaN2015-11-16 20:01:42 +0000<a href=”http://twitter.com/download/iphone” r…Look at this jokester thinking seat belt laws …NaNNaNNaNhttps://twitter.com/dog_rates/status/666345417…1010NoneNoneNoneNoneNone
2333666337882303524864NaNNaN2015-11-16 19:31:45 +0000<a href=”http://twitter.com/download/iphone” r…This is an extremely rare horned Parthenon. No…NaNNaNNaNhttps://twitter.com/dog_rates/status/666337882…910anNoneNoneNoneNone
2334666293911632134144NaNNaN2015-11-16 16:37:02 +0000<a href=”http://twitter.com/download/iphone” r…This is a funny dog. Weird toes. Won’t come do…NaNNaNNaNhttps://twitter.com/dog_rates/status/666293911…310aNoneNoneNoneNone
2335666287406224695296NaNNaN2015-11-16 16:11:11 +0000<a href=”http://twitter.com/download/iphone” r…This is an Albanian 3 1/2 legged Episcopalian…NaNNaNNaNhttps://twitter.com/dog_rates/status/666287406…12anNoneNoneNoneNone
2336666273097616637952NaNNaN2015-11-16 15:14:19 +0000<a href=”http://twitter.com/download/iphone” r…Can take selfies 11/10 https://t.co/ws2AMaNwPWNaNNaNNaNhttps://twitter.com/dog_rates/status/666273097…1110NoneNoneNoneNoneNone
2337666268910803644416NaNNaN2015-11-16 14:57:41 +0000<a href=”http://twitter.com/download/iphone” r…Very concerned about fellow dog trapped in com…NaNNaNNaNhttps://twitter.com/dog_rates/status/666268910…1010NoneNoneNoneNoneNone
2338666104133288665088NaNNaN2015-11-16 04:02:55 +0000<a href=”http://twitter.com/download/iphone” r…Not familiar with this breed. No tail (weird)….NaNNaNNaNhttps://twitter.com/dog_rates/status/666104133…110NoneNoneNoneNoneNone
2339666102155909144576NaNNaN2015-11-16 03:55:04 +0000<a href=”http://twitter.com/download/iphone” r…Oh my. Here you are seeing an Adobe Setter giv…NaNNaNNaNhttps://twitter.com/dog_rates/status/666102155…1110NoneNoneNoneNoneNone
2340666099513787052032NaNNaN2015-11-16 03:44:34 +0000<a href=”http://twitter.com/download/iphone” r…Can stand on stump for what seems like a while…NaNNaNNaNhttps://twitter.com/dog_rates/status/666099513…810NoneNoneNoneNoneNone
2341666094000022159362NaNNaN2015-11-16 03:22:39 +0000<a href=”http://twitter.com/download/iphone” r…This appears to be a Mongolian Presbyterian mi…NaNNaNNaNhttps://twitter.com/dog_rates/status/666094000…910NoneNoneNoneNoneNone
2342666082916733198337NaNNaN2015-11-16 02:38:37 +0000<a href=”http://twitter.com/download/iphone” r…Here we have a well-established sunblockerspan…NaNNaNNaNhttps://twitter.com/dog_rates/status/666082916…610NoneNoneNoneNoneNone
2343666073100786774016NaNNaN2015-11-16 01:59:36 +0000<a href=”http://twitter.com/download/iphone” r…Let’s hope this flight isn’t Malaysian (lol). …NaNNaNNaNhttps://twitter.com/dog_rates/status/666073100…1010NoneNoneNoneNoneNone
2344666071193221509120NaNNaN2015-11-16 01:52:02 +0000<a href=”http://twitter.com/download/iphone” r…Here we have a northern speckled Rhododendron….NaNNaNNaNhttps://twitter.com/dog_rates/status/666071193…910NoneNoneNoneNoneNone
2345666063827256086533NaNNaN2015-11-16 01:22:45 +0000<a href=”http://twitter.com/download/iphone” r…This is the happiest dog you will ever see. Ve…NaNNaNNaNhttps://twitter.com/dog_rates/status/666063827…1010theNoneNoneNoneNone
2346666058600524156928NaNNaN2015-11-16 01:01:59 +0000<a href=”http://twitter.com/download/iphone” r…Here is the Rand Paul of retrievers folks! He’…NaNNaNNaNhttps://twitter.com/dog_rates/status/666058600…810theNoneNoneNoneNone
2347666057090499244032NaNNaN2015-11-16 00:55:59 +0000<a href=”http://twitter.com/download/iphone” r…My oh my. This is a rare blond Canadian terrie…NaNNaNNaNhttps://twitter.com/dog_rates/status/666057090…910aNoneNoneNoneNone
2348666055525042405380NaNNaN2015-11-16 00:49:46 +0000<a href=”http://twitter.com/download/iphone” r…Here is a Siberian heavily armored polar bear …NaNNaNNaNhttps://twitter.com/dog_rates/status/666055525…1010aNoneNoneNoneNone
2349666051853826850816NaNNaN2015-11-16 00:35:11 +0000<a href=”http://twitter.com/download/iphone” r…This is an odd dog. Hard on the outside but lo…NaNNaNNaNhttps://twitter.com/dog_rates/status/666051853…210anNoneNoneNoneNone
2350666050758794694657NaNNaN2015-11-16 00:30:50 +0000<a href=”http://twitter.com/download/iphone” r…This is a truly beautiful English Wilson Staff…NaNNaNNaNhttps://twitter.com/dog_rates/status/666050758…1010aNoneNoneNoneNone
2351666049248165822465NaNNaN2015-11-16 00:24:50 +0000<a href=”http://twitter.com/download/iphone” r…Here we have a 1949 1st generation vulpix. Enj…NaNNaNNaNhttps://twitter.com/dog_rates/status/666049248…510NoneNoneNoneNoneNone
2352666044226329800704NaNNaN2015-11-16 00:04:52 +0000<a href=”http://twitter.com/download/iphone” r…This is a purebred Piers Morgan. Loves to Netf…NaNNaNNaNhttps://twitter.com/dog_rates/status/666044226…610aNoneNoneNoneNone
2353666033412701032449NaNNaN2015-11-15 23:21:54 +0000<a href=”http://twitter.com/download/iphone” r…Here is a very happy pup. Big fan of well-main…NaNNaNNaNhttps://twitter.com/dog_rates/status/666033412…910aNoneNoneNoneNone
2354666029285002620928NaNNaN2015-11-15 23:05:30 +0000<a href=”http://twitter.com/download/iphone” r…This is a western brown Mitsubishi terrier. Up…NaNNaNNaNhttps://twitter.com/dog_rates/status/666029285…710aNoneNoneNoneNone
2355666020888022790149NaNNaN2015-11-15 22:32:08 +0000<a href=”http://twitter.com/download/iphone” r…Here we have a Japanese Irish Setter. Lost eye…NaNNaNNaNhttps://twitter.com/dog_rates/status/666020888…810NoneNoneNoneNoneNone

2356 rows × 17 columns

In [69]:

archive.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2356 entries, 0 to 2355
Data columns (total 17 columns):
tweet_id                      2356 non-null int64
in_reply_to_status_id         78 non-null float64
in_reply_to_user_id           78 non-null float64
timestamp                     2356 non-null object
source                        2356 non-null object
text                          2356 non-null object
retweeted_status_id           181 non-null float64
retweeted_status_user_id      181 non-null float64
retweeted_status_timestamp    181 non-null object
expanded_urls                 2297 non-null object
rating_numerator              2356 non-null int64
rating_denominator            2356 non-null int64
name                          2356 non-null object
doggo                         2356 non-null object
floofer                       2356 non-null object
pupper                        2356 non-null object
puppo                         2356 non-null object
dtypes: float64(4), int64(3), object(10)
memory usage: 313.0+ KB

In [29]:

# Count number of not 'None' values in columns 'doggo' to 'puppo'
(archive.loc[:,'doggo':'puppo'] != 'None').sum()

Out[29]:

doggo       97
floofer     10
pupper     257
puppo       30
dtype: int64

In [31]:

# Count number of cells of `text` with doggo, floofer, pupper, and puppo
for column in archive.columns[-4:]:
    print(column, archive.text.str.contains(column).sum())
doggo 98
floofer 4
pupper 272
puppo 37
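
The two sets of counts above disagree, so it can help to see which rows differ. The sketch below uses a made-up three-row stand-in for archive (the `toy` frame and `mismatch_index` are illustrative names, not part of the notebook) and lists the rows where the column flag and the text search disagree.

```python
import pandas as pd

# Toy stand-in for `archive`; these rows are invented for illustration.
toy = pd.DataFrame({
    "text": ["Here is a doggo. 12/10", "Such a pupper! 11/10", "Just a dog. 10/10"],
    "doggo": ["None", "None", "None"],
    "pupper": ["None", "pupper", "None"],
})

mismatch_index = {}
for column in ["doggo", "pupper"]:
    # True where the column already records the dog type
    in_column = toy[column] != "None"
    # True where the word appears in the tweet text (case-sensitive, literal match)
    in_text = toy["text"].str.contains(column, regex=False)
    # Keep the index labels where the two disagree
    mismatch_index[column] = toy.index[in_column != in_text].tolist()

print(mismatch_index)
```

Running the same loop on the real archive would point to the specific tweets behind the count differences.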

In [38]:

# Check if name is always captured
archive[['text', 'name']].sample(10)

Out[38]:

      text                                              name
1128  This is Stefan. He’s a downright remarkable pu…  Stefan
2172  Just got home from college. Dis my dog. She do…  None
935   This is Scout. Her batteries are low. 12/10 pr…  Scout
518   This is Pavlov. His floatation device has fail…  Pavlov
1132  When you’re way too slow for the “down low” po…  None
1891  These two pups are masters of camouflage. Very…  None
684   Atlas is back and this time he’s got doggles. …  None
2269  This a Norwegian Pewterschmidt named Tickles. …  None
1583  Army of water dogs here. None of them know whe…  None
904   This is Corey. He’s a Portobello Corgicool. Tr…  Corey

In [39]:

# Identify example of missing name
archive.text[2269]

Out[39]:

'This a Norwegian Pewterschmidt named Tickles. Ears for days. 12/10 I care deeply for Tickles https://t.co/0aDF62KVP7'

In [19]:

# Identify example of two names
archive.text[2232]
These two dogs are Bo & Smittens. Smittens is trying out a new deodorant and wanted Bo to smell it. 10/10 true pals https://t.co/4pw1QQ6udh

In [79]:

archive.name.value_counts()

Out[79]:

None         745
a             55
Charlie       12
Oliver        11
Cooper        11
Lucy          11
Lola          10
Tucker        10
Penny         10
Bo             9
Winston        9
Sadie          8
the            8
an             7
Toby           7
Daisy          7
Bailey         7
Buddy          7
Jax            6
Scout          6
Bella          6
Oscar          6
Jack           6
Rusty          6
Stanley        6
Milo           6
Leo            6
Dave           6
Koda           6
Gus            5
            ... 
Taco           1
Bert           1
Alexander      1
Rorie          1
Shikha         1
Snoop          1
old            1
Deacon         1
Grady          1
Yoda           1
Duchess        1
Ivar           1
Kathmandu      1
Sid            1
Dobby          1
Brudge         1
Sandra         1
Genevieve      1
Lillie         1
Dewey          1
Tedrick        1
Leonard        1
Bobby          1
Mookie         1
O              1
Rooney         1
Dook           1
Rinna          1
Kendall        1
Alfy           1
Name: name, Length: 957, dtype: int64
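
Entries such as "a", "the", "an" and "old" in the counts above are all lowercase, which suggests a quick way to surface likely non-names. This is a minimal sketch on an invented Series (`names` and `lowercase_names` are illustrative), and it assumes real names are always capitalized.

```python
import pandas as pd

# Invented sample of values like those in the name column
names = pd.Series(["Charlie", "a", "None", "the", "Lucy", "an", "old", "O"], name="name")

# str.islower() is True only for strings whose cased characters are all lowercase
lowercase_names = names[names.str.islower()]
print(lowercase_names.value_counts())
```

The check does not catch capitalized non-names such as "O" or the "None" placeholder, so those still need separate handling.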

In [120]:

archive.rating_numerator.describe()

Out[120]:

count    2356.000000
mean       13.126486
std        45.876648
min         0.000000
25%        10.000000
50%        11.000000
75%        12.000000
max      1776.000000
Name: rating_numerator, dtype: float64

In [121]:

archive.rating_denominator.describe()

Out[121]:

count    2356.000000
mean       10.455433
std         6.745237
min         0.000000
25%        10.000000
50%        10.000000
75%        10.000000
max       170.000000
Name: rating_denominator, dtype: float64
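
The extreme minimum and maximum values above suggest that some ratings were extracted incorrectly. One way to investigate, sketched here on invented tweets (the `toy` frame and its texts are assumptions, not rows from the archive), is to pull out the rows with an unusual denominator and list every "number/number" pattern in each text.

```python
import pandas as pd

# Invented examples: a decimal numerator and a text with two fractions
toy = pd.DataFrame({
    "text": [
        "This is Bella. 13.5/10 would pet",
        "Happy 4/20 from the squad! 13/10",
        "Good dog 12/10",
    ],
    "rating_numerator": [5, 4, 12],
    "rating_denominator": [10, 20, 10],
})

# Rows whose denominator is not 10 deserve a manual look
suspect = toy[toy["rating_denominator"] != 10]

# Every "numerator/denominator" pattern per tweet, allowing a decimal numerator
all_ratings = toy["text"].str.findall(r"(\d+(?:\.\d+)?)/(\d+)")

print(suspect)
print(all_ratings)
```

Tweets with more than one match, or with a decimal numerator, are the ones most likely to have been captured wrongly.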

predictions table

In [58]:

predictions

Out[58]:

tweet_id  jpg_url  img_num  p1  p1_conf  p1_dog  p2  p2_conf  p2_dog  p3  p3_conf  p3_dog
0666020888022790149https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg1Welsh_springer_spaniel0.465074Truecollie0.156665TrueShetland_sheepdog0.061428True
1666029285002620928https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg1redbone0.506826Trueminiature_pinscher0.074192TrueRhodesian_ridgeback0.072010True
2666033412701032449https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg1German_shepherd0.596461Truemalinois0.138584Truebloodhound0.116197True
3666044226329800704https://pbs.twimg.com/media/CT5Dr8HUEAA-lEu.jpg1Rhodesian_ridgeback0.408143Trueredbone0.360687Trueminiature_pinscher0.222752True
4666049248165822465https://pbs.twimg.com/media/CT5IQmsXIAAKY4A.jpg1miniature_pinscher0.560311TrueRottweiler0.243682TrueDoberman0.154629True
5666050758794694657https://pbs.twimg.com/media/CT5Jof1WUAEuVxN.jpg1Bernese_mountain_dog0.651137TrueEnglish_springer0.263788TrueGreater_Swiss_Mountain_dog0.016199True
6666051853826850816https://pbs.twimg.com/media/CT5KoJ1WoAAJash.jpg1box_turtle0.933012Falsemud_turtle0.045885Falseterrapin0.017885False
7666055525042405380https://pbs.twimg.com/media/CT5N9tpXIAAifs1.jpg1chow0.692517TrueTibetan_mastiff0.058279Truefur_coat0.054449False
8666057090499244032https://pbs.twimg.com/media/CT5PY90WoAAQGLo.jpg1shopping_cart0.962465Falseshopping_basket0.014594Falsegolden_retriever0.007959True
9666058600524156928https://pbs.twimg.com/media/CT5Qw94XAAA_2dP.jpg1miniature_poodle0.201493Truekomondor0.192305Truesoft-coated_wheaten_terrier0.082086True
10666063827256086533https://pbs.twimg.com/media/CT5Vg_wXIAAXfnj.jpg1golden_retriever0.775930TrueTibetan_mastiff0.093718TrueLabrador_retriever0.072427True
11666071193221509120https://pbs.twimg.com/media/CT5cN_3WEAAlOoZ.jpg1Gordon_setter0.503672TrueYorkshire_terrier0.174201TruePekinese0.109454True
12666073100786774016https://pbs.twimg.com/media/CT5d9DZXAAALcwe.jpg1Walker_hound0.260857TrueEnglish_foxhound0.175382TrueIbizan_hound0.097471True
13666082916733198337https://pbs.twimg.com/media/CT5m4VGWEAAtKc8.jpg1pug0.489814Truebull_mastiff0.404722TrueFrench_bulldog0.048960True
14666094000022159362https://pbs.twimg.com/media/CT5w9gUW4AAsBNN.jpg1bloodhound0.195217TrueGerman_shepherd0.078260Truemalinois0.075628True
15666099513787052032https://pbs.twimg.com/media/CT51-JJUEAA6hV8.jpg1Lhasa0.582330TrueShih-Tzu0.166192TrueDandie_Dinmont0.089688True
16666102155909144576https://pbs.twimg.com/media/CT54YGiWUAEZnoK.jpg1English_setter0.298617TrueNewfoundland0.149842Trueborzoi0.133649True
17666104133288665088https://pbs.twimg.com/media/CT56LSZWoAAlJj2.jpg1hen0.965932Falsecock0.033919Falsepartridge0.000052False
18666268910803644416https://pbs.twimg.com/media/CT8QCd1WEAADXws.jpg1desktop_computer0.086502Falsedesk0.085547Falsebookcase0.079480False
19666273097616637952https://pbs.twimg.com/media/CT8T1mtUwAA3aqm.jpg1Italian_greyhound0.176053Truetoy_terrier0.111884Truebasenji0.111152True
20666287406224695296https://pbs.twimg.com/media/CT8g3BpUEAAuFjg.jpg1Maltese_dog0.857531Truetoy_poodle0.063064Trueminiature_poodle0.025581True
21666293911632134144https://pbs.twimg.com/media/CT8mx7KW4AEQu8N.jpg1three-toed_sloth0.914671Falseotter0.015250Falsegreat_grey_owl0.013207False
22666337882303524864https://pbs.twimg.com/media/CT9OwFIWEAMuRje.jpg1ox0.416669FalseNewfoundland0.278407Truegroenendael0.102643True
23666345417576210432https://pbs.twimg.com/media/CT9Vn7PWoAA_ZCM.jpg1golden_retriever0.858744TrueChesapeake_Bay_retriever0.054787TrueLabrador_retriever0.014241True
24666353288456101888https://pbs.twimg.com/media/CT9cx0tUEAAhNN_.jpg1malamute0.336874TrueSiberian_husky0.147655TrueEskimo_dog0.093412True
25666362758909284353https://pbs.twimg.com/media/CT9lXGsUcAAyUFt.jpg1guinea_pig0.996496Falseskunk0.002402Falsehamster0.000461False
26666373753744588802https://pbs.twimg.com/media/CT9vZEYWUAAlZ05.jpg1soft-coated_wheaten_terrier0.326467TrueAfghan_hound0.259551Truebriard0.206803True
27666396247373291520https://pbs.twimg.com/media/CT-D2ZHWIAA3gK1.jpg1Chihuahua0.978108Truetoy_terrier0.009397Truepapillon0.004577True
28666407126856765440https://pbs.twimg.com/media/CT-NvwmW4AAugGZ.jpg1black-and-tan_coonhound0.529139Truebloodhound0.244220Trueflat-coated_retriever0.173810True
29666411507551481857https://pbs.twimg.com/media/CT-RugiWIAELEaq.jpg1coho0.404640Falsebarracouta0.271485Falsegar0.189945False
2045886366144734445568https://pbs.twimg.com/media/DE0BTnQUwAApKEH.jpg1French_bulldog0.999201TrueChihuahua0.000361TrueBoston_bull0.000076True
2046886680336477933568https://pbs.twimg.com/media/DE4fEDzWAAAyHMM.jpg1convertible0.738995Falsesports_car0.139952Falsecar_wheel0.044173False
2047886736880519319552https://pbs.twimg.com/media/DE5Se8FXcAAJFx4.jpg1kuvasz0.309706TrueGreat_Pyrenees0.186136TrueDandie_Dinmont0.086346True
2048886983233522544640https://pbs.twimg.com/media/DE8yicJW0AAAvBJ.jpg2Chihuahua0.793469Truetoy_terrier0.143528Truecan_opener0.032253False
2049887101392804085760https://pbs.twimg.com/media/DE-eAq6UwAA-jaE.jpg1Samoyed0.733942TrueEskimo_dog0.035029TrueStaffordshire_bullterrier0.029705True
2050887343217045368832https://pbs.twimg.com/ext_tw_video_thumb/88734…1Mexican_hairless0.330741Truesea_lion0.275645FalseWeimaraner0.134203True
2051887473957103951883https://pbs.twimg.com/media/DFDw2tyUQAAAFke.jpg2Pembroke0.809197TrueRhodesian_ridgeback0.054950Truebeagle0.038915True
2052887517139158093824https://pbs.twimg.com/ext_tw_video_thumb/88751…1limousine0.130432Falsetow_truck0.029175Falseshopping_cart0.026321False
2053887705289381826560https://pbs.twimg.com/media/DFHDQBbXgAEqY7t.jpg1basset0.821664Trueredbone0.087582TrueWeimaraner0.026236True
2054888078434458587136https://pbs.twimg.com/media/DFMWn56WsAAkA7B.jpg1French_bulldog0.995026Truepug0.000932Truebull_mastiff0.000903True
2055888202515573088257https://pbs.twimg.com/media/DFDw2tyUQAAAFke.jpg2Pembroke0.809197TrueRhodesian_ridgeback0.054950Truebeagle0.038915True
2056888554962724278272https://pbs.twimg.com/media/DFTH_O-UQAACu20.jpg3Siberian_husky0.700377TrueEskimo_dog0.166511Truemalamute0.111411True
2057888804989199671297https://pbs.twimg.com/media/DFWra-3VYAA2piG.jpg1golden_retriever0.469760TrueLabrador_retriever0.184172TrueEnglish_setter0.073482True
2058888917238123831296https://pbs.twimg.com/media/DFYRgsOUQAARGhO.jpg1golden_retriever0.714719TrueTibetan_mastiff0.120184TrueLabrador_retriever0.105506True
2059889278841981685760https://pbs.twimg.com/ext_tw_video_thumb/88927…1whippet0.626152Trueborzoi0.194742TrueSaluki0.027351True
2060889531135344209921https://pbs.twimg.com/media/DFg_2PVW0AEHN3p.jpg1golden_retriever0.953442TrueLabrador_retriever0.013834Trueredbone0.007958True
2061889638837579907072https://pbs.twimg.com/media/DFihzFfXsAYGDPR.jpg1French_bulldog0.991650Trueboxer0.002129TrueStaffordshire_bullterrier0.001498True
2062889665388333682689https://pbs.twimg.com/media/DFi579UWsAAatzw.jpg1Pembroke0.966327TrueCardigan0.027356Truebasenji0.004633True
2063889880896479866881https://pbs.twimg.com/media/DFl99B1WsAITKsg.jpg1French_bulldog0.377417TrueLabrador_retriever0.151317Truemuzzle0.082981False
2064890006608113172480https://pbs.twimg.com/media/DFnwSY4WAAAMliS.jpg1Samoyed0.957979TruePomeranian0.013884Truechow0.008167True
2065890240255349198849https://pbs.twimg.com/media/DFrEyVuW0AAO3t9.jpg1Pembroke0.511319TrueCardigan0.451038TrueChihuahua0.029248True
2066890609185150312448https://pbs.twimg.com/media/DFwUU__XcAEpyXI.jpg1Irish_terrier0.487574TrueIrish_setter0.193054TrueChesapeake_Bay_retriever0.118184True
2067890729181411237888https://pbs.twimg.com/media/DFyBahAVwAAhUTd.jpg2Pomeranian0.566142TrueEskimo_dog0.178406TruePembroke0.076507True
2068890971913173991426https://pbs.twimg.com/media/DF1eOmZXUAALUcq.jpg1Appenzeller0.341703TrueBorder_collie0.199287Trueice_lolly0.193548False
2069891087950875897856https://pbs.twimg.com/media/DF3HwyEWsAABqE6.jpg1Chesapeake_Bay_retriever0.425595TrueIrish_terrier0.116317TrueIndian_elephant0.076902False
2070891327558926688256https://pbs.twimg.com/media/DF6hr6BUMAAzZgT.jpg2basset0.555712TrueEnglish_springer0.225770TrueGerman_short-haired_pointer0.175219True
2071891689557279858688https://pbs.twimg.com/media/DF_q7IAWsAEuuN8.jpg1paper_towel0.170278FalseLabrador_retriever0.168086Truespatula0.040836False
2072891815181378084864https://pbs.twimg.com/media/DGBdLU1WsAANxJ9.jpg1Chihuahua0.716012Truemalamute0.078253Truekelpie0.031379True
2073892177421306343426https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg1Chihuahua0.323581TruePekinese0.090647Truepapillon0.068957True
2074892420643555336193https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg1orange0.097049Falsebagel0.085851Falsebanana0.076110False

2075 rows × 12 columns

Does this give me all of the images I want? Could I just use the IDs to subset archive?
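
One possible answer to the question above is to keep only the archive rows whose tweet_id also appears in predictions. The sketch below shows the idea on two invented frames (`archive_toy` and `predictions_toy` are illustrative names).

```python
import pandas as pd

# Invented stand-ins for the archive and predictions tables
archive_toy = pd.DataFrame({"tweet_id": [1, 2, 3, 4], "text": ["a", "b", "c", "d"]})
predictions_toy = pd.DataFrame({"tweet_id": [2, 4], "jpg_url": ["u2", "u4"]})

# True for archive rows that have a matching image prediction
has_image = archive_toy["tweet_id"].isin(predictions_toy["tweet_id"])
with_images = archive_toy[has_image]

print(f"{has_image.sum()} of {len(archive_toy)} tweets have an image prediction")
```

This assumes that a row in predictions is a reliable sign that the tweet has an image.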

In [59]:

predictions.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2075 entries, 0 to 2074
Data columns (total 12 columns):
tweet_id    2075 non-null int64
jpg_url     2075 non-null object
img_num     2075 non-null int64
p1          2075 non-null object
p1_conf     2075 non-null float64
p1_dog      2075 non-null bool
p2          2075 non-null object
p2_conf     2075 non-null float64
p2_dog      2075 non-null bool
p3          2075 non-null object
p3_conf     2075 non-null float64
p3_dog      2075 non-null bool
dtypes: bool(3), float64(3), int64(2), object(4)
memory usage: 152.1+ KB

api_data table

In [31]:

api_data

Out[31]:

tweet_id  retweet_count  favorite_count  followers_count
08924206435553361938559386956989325
18921774213063434266293331686989325
28918151813780848644176249666989325
38916895572798586888683420806989325
48913275589266882569451402286989325
58910879508758978563127201756989325
68909719131739914262082118206989325
789072918141123788818984653746989325
88906091851503124484281277286989325
98902402553491988497453318736989325
108900066081131724807367305896989325
118898808964798668814993277276989325
1288966538833368268910110480216989325
138896388375799070724567271176989325
148895311353442099212243150666989325
158892788419816857605452252466989325
168889172381238312964516290166989325
178888049891996712974365255376989325
188885549627242782723597198606989326
198880784344585871363511217076989326
208877052893818265605417301176989326
2188751713915809382411719461326989326
2288747395710395188318314689606989326
2388734321704536883210451336026989326
248871013928040857605988304596989326
258869832335225446407809350746989326
268867368805193195523314120436989326
278866803364779335684489223676989326
288863661447344455683209211506989326
2988626700928501760041166989326
23156664115075514818573284486989510
2316666407126856765440411106989510
2317666396247373291520861666989510
2318666373753744588802931896989510
23196663627589092843535747796989510
2320666353288456101888732216989510
23216663454175762104321392986989510
2322666337882303524864921996989510
23236662939116321341443575096989510
2324666287406224695296661486989510
2325666273097616637952761756989511
2326666268910803644416351046989510
23276661041332886650886637143536989510
232866610215590914457613806989510
2329666099513787052032681566989510
2330666094000022159362741646989510
2331666082916733198337451196989510
23326660731007867740161643226989510
2333666071193221509120621486989510
23346660638272560865332204766989510
2335666058600524156928571126989510
23366660570904992440321422986989510
23376660555250424053802524346989510
233866605185382685081685312246989510
2339666050758794694657581326989511
2340666049248165822465411096989511
23416660442263298007041412986989511
2342666033412701032449451256989511
2343666029285002620928471296989510
234466602088802279014951725606989510

2345 rows × 4 columns

In [33]:

api_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2345 entries, 0 to 2344
Data columns (total 4 columns):
tweet_id           2345 non-null int64
retweet_count      2345 non-null int64
favorite_count     2345 non-null int64
followers_count    2345 non-null int64
dtypes: int64(4)
memory usage: 73.4 KB

Data Inclusion Criteria

We are expected to use the following criteria to select the required data:

  • Do not include retweets
  • Only tweets that have images

An additional point was raised in student discussions: the archive also contains reply tweets, which generally carry upgraded or downgraded ratings of a dog that was already rated. This means that in some cases there are two observations (ratings) for the same dog. I therefore decided to include only original ratings and added one more criterion:

  • Do not include replies
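
The retweet and reply criteria above can be expressed as a filter on the two status ID columns, which are null for original tweets. This is a minimal sketch on an invented frame (`toy` and `originals` are illustrative names), not the notebook's own cleaning step.

```python
import numpy as np
import pandas as pd

# Invented rows: one original tweet, one retweet, one reply
toy = pd.DataFrame({
    "tweet_id": [1, 2, 3],
    "retweeted_status_id": [np.nan, 8.87e17, np.nan],
    "in_reply_to_status_id": [np.nan, np.nan, 6.7e17],
})

# A non-null status ID marks the row as a retweet or a reply
is_retweet = toy["retweeted_status_id"].notna()
is_reply = toy["in_reply_to_status_id"].notna()

# Keep only rows that are neither
originals = toy[~is_retweet & ~is_reply]
print(originals)
```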

Findings

Quality

archive table
  • Retweets are included in the dataset
  • Replies are included in the dataset
  • Erroneous datatypes (tweet_id, in_reply_to_status_id, in_reply_to_user_id, timestamp, retweeted_status_id, retweeted_status_user_id, retweeted_status_timestamp, doggo, floofer, pupper, and puppo columns)
  • Missing info in expanded_urls
  • Nulls represented as “None” (str) for name, doggo, floofer, pupper, and puppo columns
  • Missing counts for doggo, floofer, pupper, and puppo
  • Names that appear in text are missing from name, e.g. index 1852 – Reggie
  • Some names identified are not names
  • text column includes both text and short version of link
  • Second name missing if two are mentioned, e.g. index 2232 – Bo & Smittens
  • Some extracted values for rating_numerator and rating_denominator seem to be in error
predictions table
  • Erroneous datatype (tweet_id)
  • The lower number of entries means that some posts don’t have images
api_data table
  • Erroneous datatype (tweet_id)
  • Retweet and favorite information is not available for all tweets and cannot be retrieved
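
The datatype issues listed under Quality could be handled with conversions along these lines. The sketch uses two ID and timestamp pairs that appear in the archive output earlier in this notebook; the `toy` frame itself is illustrative. Storing tweet_id as a string reflects that it is an identifier rather than a quantity, and it avoids the loss of digits that 18-digit IDs suffer in float64 columns.

```python
import pandas as pd

toy = pd.DataFrame({
    "tweet_id": [889880896479866881, 666020888022790149],
    "timestamp": ["2017-07-25 16:11:53 +0000", "2015-11-15 22:32:08 +0000"],
})

# Identifiers are labels, so store them as strings
toy["tweet_id"] = toy["tweet_id"].astype(str)

# Parse the timestamp strings into timezone-aware datetimes
toy["timestamp"] = pd.to_datetime(toy["timestamp"])

print(toy.dtypes)
```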

Tidiness

archive table
  • There are multiple columns containing the same type of data, e.g. doggo, floofer, pupper, and puppo all contain dog types
predictions table
  • There are multiple columns containing the same type of data, e.g. p1, p2, and p3 all contain dog breed predictions
api_data table
  • This data is separate from the other tweet data

Clean the Data

In [7]:

# Make copies to preserve the original datasets
archive_clean = archive.copy()
predictions_clean = predictions.copy()
api_data_clean = api_data.copy()

Missing Data

There are four areas of missing data identified:

  1. Missing info in expanded_urls
  2. Missing counts for doggo, floofer, pupper, and puppo
  3. Names that appear in text are missing from name, e.g. index 1852 – Reggie
  4. Second name missing if two are mentioned, e.g. index 2232 – Bo & Smittens

I am not concerned about tracking down the missing url information because I don’t plan to analyze it.

Missing counts for doggo, floofer, pupper, and puppo in archive table

The issue of nulls represented as “None” (str) in the doggo, floofer, pupper, and puppo columns can also be addressed here.

Define

Use a for loop to check whether each tweet's text contains each column header. Store the dog type if it is found; otherwise store NaN.

Code

In [8]:

dog_types = list(archive_clean.iloc[:,-4:])
dog_types

Out[8]:

['doggo', 'floofer', 'pupper', 'puppo']

In [9]:

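def find_dog_type(df, dog_type):
    # Build one value per row: the dog type if it appears in the text, else NaN
    dog_list = []
    for row in df['text']:
        if dog_type in row:
            dog_list.append(dog_type)
        else:
            # np.nan is used because the np.NaN alias was removed in NumPy 2.0
            dog_list.append(np.nan)
    return dog_list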

In [10]:

for dog_type in dog_types:
    archive_clean[dog_type] = find_dog_type(archive_clean, dog_type)
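
The same result can probably be reached without an explicit Python loop over rows. The sketch below is an alternative, not the approach used in this notebook, and runs on an invented frame (`toy` is an illustrative name): a constant Series is masked with the result of .str.contains(), so rows without a match become NaN.

```python
import pandas as pd

# Invented tweets for illustration
toy = pd.DataFrame({"text": ["Here is a doggo", "Just a pupper", "No stage here"]})

for dog_type in ["doggo", "pupper"]:
    # True where the dog type appears literally in the text
    found = toy["text"].str.contains(dog_type, regex=False)
    # A Series filled with the dog type; .where() keeps it only where found is True
    toy[dog_type] = pd.Series(dog_type, index=toy.index).where(found)

print(toy)
```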

Test

In [78]:

# Check non-null data counts for columns
archive_clean[dog_types].info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2356 entries, 0 to 2355
Data columns (total 4 columns):
doggo      98 non-null object
floofer    4 non-null object
pupper     272 non-null object
puppo      37 non-null object
dtypes: object(4)
memory usage: 73.7+ KB

In [45]:

# Compare to counts from text
for dog_type in dog_types:
    print(dog_type, archive_clean.text.str.contains(dog_type).sum())
doggo 98
floofer 4
pupper 272
puppo 37

The counts found in the text strings match the counts in the columns.

Missing names identified from text in name in archive table

The issues “Some names identified are not names” and “Nulls represented as “None” (str) for name” can also be addressed here.

This would also be the place to address “Second name missing if two are mentioned”; however, I decided that this was too difficult for me to do.
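
For reference, a possible starting point for the two-name case is a regular expression that looks for two capitalized words joined by "&" or "and". This is only a sketch tested on the Bo & Smittens text (without its link); `pair_pattern` is a hypothetical name, and the pattern would need adjusting if the raw text encodes the ampersand differently or if capitalized non-names appear in the same position.

```python
import re

text = ("These two dogs are Bo & Smittens. Smittens is trying out a new "
        "deodorant and wanted Bo to smell it. 10/10 true pals")

# Two capitalized words (letters and apostrophes) joined by "&" or "and"
pair_pattern = re.compile(r"\b([A-Z][a-z']+) (?:&|and) ([A-Z][a-z']+)\b")

match = pair_pattern.search(text)
names = list(match.groups()) if match else []
print(names)
```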

Define

Create a function to identify pet names and repopulate the name column.

Code

Pet names are capitalized, usually between 2 and 10 characters long, and typically appear just before the first period. They usually contain only letters and apostrophes, and certain words are rarely used as pet names.

In [11]:

def find_names(df):
    # Capitalized words that are not pet names
    other_words = ["This", "Xbox", "Oh", "Christmas", "Up", "Pupper", "Doggo", "Puppo", "Floofer"]
    allowed_chars = string.ascii_letters + "'"
    name_list = []
    for row in df['text']:
        # Find first "."
        first_period = row.find(".")
        # If no period is found, assume there is no name
        if first_period == -1:
            name_list.append(np.nan)
            continue
        # Find word before period
        word_before = row[:first_period].rsplit(' ', 1)[-1]
        # Exclusion criteria: not capitalized, more than 10 or fewer than 2 characters,
        # contains one of other_words, or has characters other than letters and apostrophes
        if (word_before != word_before.title()
                or len(word_before) > 10
                or len(word_before) < 2
                or any(word in word_before for word in other_words)
                or any(c not in allowed_chars for c in word_before)):
            name_list.append(np.nan)
        else:
            name_list.append(word_before)

    return name_list

In [12]:

name_list = find_names(archive_clean)
archive_clean.name = name_list

Test

In [101]:

# View names and NaNs
archive_clean.name.head(10)

Out[101]:

0     Phineas
1       Tilly
2      Archie
3       Darla
4    Franklin
5         NaN
6         Jax
7         NaN
8        Zoey
9      Cassie
Name: name, dtype: object

In [157]:

# Check value counts for unexpected names
archive_clean.name.value_counts()

Out[157]:

Charlie      14
Oliver       12
Cooper       11
Tucker       10
Lola         10
Lucy         10
Winston       9
Penny         9
Daisy         8
Bailey        7
Buddy         7
Bo            7
Toby          6
Scout         6
Sadie         6
Bella         6
Rusty         6
Stanley       6
Leo           6
Dave          6
Milo          6
Koda          6
Loki          5
Louis         5
Sophie        5
Jax           5
Gus           5
Larry         5
Ruby          5
Oscar         5
             ..
Jeffri        1
Lilah         1
Fwed          1
Bert          1
Rorie         1
Rinna         1
Pubert        1
Rooney        1
Kathmandu     1
Pumpkin       1
Darby         1
Apollo        1
Gustav        1
Banjo         1
Jeremy        1
Yoda          1
Duchess       1
Ivar          1
Hemry         1
Mookie        1
Dobby         1
Brudge        1
Sandra        1
Genevieve     1
Grady         1
Lillie        1
Tedrick       1
Leonard       1
Striker       1
Alfy          1
Name: name, Length: 961, dtype: int64

In [135]:

# Visually compare sample of results
archive_clean[['text', 'name']].sample(10)

Out[135]:

                                                 text      name
2188  This is Jeremy. He hasn’t grown into his skin …    Jeremy
 312  Meet Lola. Her hobbies include being precious …      Lola
2130  This is Wally. He’s a Flaccid Mitochondria. Go…     Wally
 971  Meet Lilah. She agreed on one quick pic. Now s…     Lilah
2353  Here is a very happy pup. Big fan of well-main…       NaN
2280  This is Fwed. He is a Canadian Asian Taylormad…      Fwed
 942  This is Grizzie. She’s a semi-submerged Bahrai…   Grizzie
 732  Idk why this keeps happening. We only rate dog…       NaN
2178  Super rare dog right here guys. Doesn’t bark. …       NaN
 322  This is Sunshine. She doesn’t believe in perso…  Sunshine

Tidy Data

The next step is to address tidiness issues. Three were identified:

  1. There are multiple columns containing the same type of data in the archive table, e.g. doggo, floofer, pupper, puppo
  2. There are multiple columns containing the same type of data in the predictions table, e.g. p1, p2, p3 all contain dog breed predictions
  3. The tweet data in the api_data table is separate from the other tweet data

Multiple columns containing the same type of data in the archive table

There is a small amount of overlap, but I would rather the posts be classified once.

Define

Create a column called dog_type and merge all data in the order puppo, pupper, floofer, doggo using .fillna(). Drop the redundant columns.

Code

In [13]:

archive_clean['dog_type'] = archive_clean.puppo.fillna(archive_clean.pupper.fillna(archive_clean.floofer.fillna(archive_clean.doggo)))
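
To make the precedence of the nested .fillna() concrete, here is a small sketch on made-up data. The toy frame and its values are assumptions for illustration only; the point is that a row tagged with two stages keeps the one that comes first in the puppo, pupper, floofer, doggo order.

```python
import numpy as np
import pandas as pd

# Toy stage columns (made up); row 2 is tagged as both doggo and pupper
toy = pd.DataFrame({
    "doggo":   ["doggo", np.nan, "doggo", np.nan],
    "floofer": [np.nan, np.nan, np.nan, np.nan],
    "pupper":  [np.nan, "pupper", "pupper", np.nan],
    "puppo":   [np.nan, np.nan, np.nan, np.nan],
}, dtype=object)
stages = ["doggo", "floofer", "pupper", "puppo"]

# Count rows carrying more than one stage before collapsing them
overlap = toy[stages].notna().sum(axis=1) > 1

# Same nesting as the cell above: puppo first, then pupper, floofer, doggo
toy["dog_type"] = toy.puppo.fillna(toy.pupper.fillna(toy.floofer.fillna(toy.doggo)))

print(overlap.sum())
print(toy["dog_type"].tolist())
```

Counting the overlapping rows first gives a number to compare against any drop in the per-stage counts after the merge.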

In [14]:

archive_clean.drop(['doggo', 'floofer', 'pupper', 'puppo'], axis=1, inplace=True)

Test

In [146]:

# Confirm NaNs remain
archive_clean.dog_type.head(10)

Out[146]:

0      NaN
1      NaN
2      NaN
3      NaN
4      NaN
5      NaN
6      NaN
7      NaN
8      NaN
9    doggo
Name: dog_type, dtype: object

In [147]:

# Check dog_type counts
archive_clean.dog_type.value_counts()

Out[147]:

pupper     272
doggo       86
puppo       37
floofer      4
Name: dog_type, dtype: int64

The original count was:

  • doggo 98
  • floofer 4
  • pupper 272
  • puppo 37

So only 12 counts were lost, all from doggo, which seems acceptable to me.

In [139]:

# Confirm column drop
archive_clean.columns

Out[139]:

Index(['tweet_id', 'in_reply_to_status_id', 'in_reply_to_user_id', 'timestamp',
       'source', 'text', 'retweeted_status_id', 'retweeted_status_user_id',
       'retweeted_status_timestamp', 'expanded_urls', 'rating_numerator',
       'rating_denominator', 'name', 'dog_type'],
      dtype='object')

Multiple columns containing the same type of data in the predictions table

Define

Change column names for ease of use with pd.wide_to_long. Use pd.wide_to_long to

  • melt p1_conf, p2_conf, p3_conf to a confidence column
  • melt p1, p2, p3 to a prediction column
  • melt p1_dog, p2_dog, p3_dog to a dog column.

Code

In [15]:

# Change column names
col_names = ['tweet_id', 'jpg_url', 'img_num', 
             'prediction_1', 'confidence_1', 'dog_1', 
             'prediction_2', 'confidence_2', 'dog_2', 
             'prediction_3', 'confidence_3', 'dog_3']
predictions_clean.columns = col_names

In [16]:

# Convert wide to long
predictions_clean = pd.wide_to_long(predictions_clean,
                                    stubnames=['prediction', 'confidence', 'dog'],
                                    i=['tweet_id', 'jpg_url', 'img_num'],
                                    j='prediction_order',
                                    sep='_').reset_index()

Test

In [189]:

# Visual inspection
predictions_clean.head(9)

Out[189]:

             tweet_id                                          jpg_url  img_num  prediction_order              prediction  confidence   dog
0  666020888022790149  https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg        1                 1  Welsh_springer_spaniel    0.465074  True
1  666020888022790149  https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg        1                 2                  collie    0.156665  True
2  666020888022790149  https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg        1                 3       Shetland_sheepdog    0.061428  True
3  666029285002620928  https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg        1                 1                 redbone    0.506826  True
4  666029285002620928  https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg        1                 2      miniature_pinscher    0.074192  True
5  666029285002620928  https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg        1                 3     Rhodesian_ridgeback    0.072010  True
6  666033412701032449  https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg        1                 1         German_shepherd    0.596461  True
7  666033412701032449  https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg        1                 2                malinois    0.138584  True
8  666033412701032449  https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg        1                 3              bloodhound    0.116197  True

In [190]:

predictions_clean.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6225 entries, 0 to 6224
Data columns (total 7 columns):
tweet_id            6225 non-null int64
jpg_url             6225 non-null object
img_num             6225 non-null int64
prediction_order    6225 non-null object
prediction          6225 non-null object
confidence          6225 non-null float64
dog                 6225 non-null bool
dtypes: bool(1), float64(1), int64(2), object(3)
memory usage: 298.0+ KB

In [191]:

# Compare count to original counts
6225/2075

Out[191]:

3.0

Given that there are three predictions for each, it is expected that the length would increase by three times. This is what has occurred.
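
To show the reshaping rule on something small enough to read at a glance, here is a sketch of pd.wide_to_long on a made-up two-row table with two predictions each. The column layout mirrors the renamed columns above; the values are invented.

```python
import pandas as pd

# Toy wide table (values made up) with the stub_number column layout
wide = pd.DataFrame({
    "tweet_id": [1, 2],
    "prediction_1": ["collie", "pug"],
    "confidence_1": [0.6, 0.7],
    "dog_1": [True, True],
    "prediction_2": ["redbone", "bagel"],
    "confidence_2": [0.3, 0.2],
    "dog_2": [True, False],
})

long_df = pd.wide_to_long(
    wide,
    stubnames=["prediction", "confidence", "dog"],
    i="tweet_id",
    j="prediction_order",
    sep="_",
).reset_index()

# Each input row yields one output row per numeric suffix
print(len(long_df), len(wide))
print(long_df.sort_values(["tweet_id", "prediction_order"]))
```

With two suffixes the row count doubles, in the same way that the three suffixes in the real table triple it.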

Tweet data in the api_data table is separate from the other tweet data

Define

Merge the data from api_data with the archive table

Code

In [17]:

archive_clean = pd.merge(left=archive_clean, right=api_data_clean, how='left', on='tweet_id')

Test

In [194]:

archive_clean.head()

Out[194]:

             tweet_id  in_reply_to_status_id  in_reply_to_user_id                  timestamp                                           source                                             text  retweeted_status_id  retweeted_status_user_id  retweeted_status_timestamp                                    expanded_urls  rating_numerator  rating_denominator      name  dog_type  retweet_count  favorite_count  followers_count
0  892420643555336193                    NaN                  NaN  2017-08-01 16:23:56 +0000  <a href="http://twitter.com/download/iphone" r…  This is Phineas. He’s a mystical boy. Only eve…                  NaN                       NaN                         NaN  https://twitter.com/dog_rates/status/892420643…                13                  10   Phineas       NaN         8561.0         38696.0        6984446.0
1  892177421306343426                    NaN                  NaN  2017-08-01 00:17:27 +0000  <a href="http://twitter.com/download/iphone" r…  This is Tilly. She’s just checking pup on you….                  NaN                       NaN                         NaN  https://twitter.com/dog_rates/status/892177421…                13                  10     Tilly       NaN         6295.0         33171.0        6984446.0
2  891815181378084864                    NaN                  NaN  2017-07-31 00:18:03 +0000  <a href="http://twitter.com/download/iphone" r…  This is Archie. He is a rare Norwegian Pouncin…                  NaN                       NaN                         NaN  https://twitter.com/dog_rates/status/891815181…                12                  10    Archie       NaN         4176.0         24967.0        6984446.0
3  891689557279858688                    NaN                  NaN  2017-07-30 15:58:51 +0000  <a href="http://twitter.com/download/iphone" r…  This is Darla. She commenced a snooze mid meal…                  NaN                       NaN                         NaN  https://twitter.com/dog_rates/status/891689557…                13                  10     Darla       NaN         8683.0         42087.0        6984446.0
4  891327558926688256                    NaN                  NaN  2017-07-29 16:00:24 +0000  <a href="http://twitter.com/download/iphone" r…  This is Franklin. He would like you to stop ca…                  NaN                       NaN                         NaN  https://twitter.com/dog_rates/status/891327558…                12                  10  Franklin       NaN         9453.0         40235.0        6984446.0

In [82]:

archive_clean.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2356 entries, 0 to 2355
Data columns (total 17 columns):
tweet_id                      2356 non-null int64
in_reply_to_status_id         78 non-null float64
in_reply_to_user_id           78 non-null float64
timestamp                     2356 non-null object
source                        2356 non-null object
text                          2356 non-null object
retweeted_status_id           181 non-null float64
retweeted_status_user_id      181 non-null float64
retweeted_status_timestamp    181 non-null object
expanded_urls                 2297 non-null object
rating_numerator              2356 non-null int64
rating_denominator            2356 non-null int64
name                          1531 non-null object
dog_type                      399 non-null object
retweet_count                 2345 non-null float64
favorite_count                2345 non-null float64
followers_count               2345 non-null float64
dtypes: float64(7), int64(3), object(7)
memory usage: 331.3+ KB
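
The info() output above shows 2345 non-null API values against 2356 rows, so 11 tweets found no match in api_data, and the counts were upcast to float64 to hold the resulting NaNs. A sketch of how a left merge can report this directly, using made-up toy tables (the names archive_toy and api_toy are mine):

```python
import pandas as pd

# Toy tables (made up): tweet 3 has no API record
archive_toy = pd.DataFrame({"tweet_id": [1, 2, 3], "text": ["a", "b", "c"]})
api_toy = pd.DataFrame({"tweet_id": [1, 2], "retweet_count": [10, 20]})

merged = pd.merge(
    archive_toy, api_toy,
    how="left", on="tweet_id",
    validate="one_to_one",   # raise if tweet_id is duplicated on either side
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

unmatched = merged.loc[merged["_merge"] == "left_only", "tweet_id"].tolist()
print(unmatched)
print(merged["retweet_count"].dtype)
```

The indicator column makes the unmatched tweets easy to list, and validate guards against a duplicated tweet_id silently multiplying rows.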

Data Quality

Some posts don’t have images

Define

Remove any tweet ids in the archive table that aren’t in the predictions table.

Code

In [18]:

# Confirm the number to be removed
no_image = (~archive_clean.tweet_id.isin(list(predictions_clean.tweet_id)))
no_image.sum()

Out[18]:

281

In [19]:

# Remove non-shared tweet_id's
archive_clean = archive_clean[~no_image]

Test

In [65]:

# Confirm no tweet_id's without images
(~archive_clean.tweet_id.isin(list(predictions_clean.tweet_id))).sum()

Out[65]:

0

In [86]:

# Confirm new archive_clean counts
archive_clean.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 2075 entries, 0 to 2355
Data columns (total 17 columns):
tweet_id                      2075 non-null int64
in_reply_to_status_id         23 non-null float64
in_reply_to_user_id           23 non-null float64
timestamp                     2075 non-null object
source                        2075 non-null object
text                          2075 non-null object
retweeted_status_id           81 non-null float64
retweeted_status_user_id      81 non-null float64
retweeted_status_timestamp    81 non-null object
expanded_urls                 2075 non-null object
rating_numerator              2075 non-null int64
rating_denominator            2075 non-null int64
name                          1422 non-null object
dog_type                      338 non-null object
retweet_count                 2069 non-null float64
favorite_count                2069 non-null float64
followers_count               2069 non-null float64
dtypes: float64(7), int64(3), object(7)
memory usage: 291.8+ KB

Replies and retweets are included in archive table

Define

  • Identify rows that have info for in_reply_to_status_id or retweeted_status_id and remove them from archive_clean.
  • Remove the redundant columns (in_reply_to_status_id, in_reply_to_user_id, retweeted_status_id, retweeted_status_user_id, retweeted_status_timestamp).
  • Remove non-shared ids from predictions_clean.

Code

In [20]:

# Check rows to remove for replies
replies = (~archive_clean.in_reply_to_status_id.isnull())
replies.sum()

Out[20]:

23

In [21]:

# Remove replies
archive_clean = archive_clean[~replies]

In [22]:

# Check rows to remove for retweets
retweets = (~archive_clean.retweeted_status_user_id.isnull())
retweets.sum()

Out[22]:

81

In [23]:

# Remove retweets
archive_clean = archive_clean[~retweets]

In [24]:

archive_clean.drop(['in_reply_to_status_id', 
                    'in_reply_to_user_id', 
                    'retweeted_status_id', 
                    'retweeted_status_user_id', 
                    'retweeted_status_timestamp'], axis=1, inplace=True)

In [25]:

# Identify tweet_ids in predictions not in archive
not_shared = (~predictions_clean.tweet_id.isin(list(archive_clean.tweet_id)))
not_shared.sum()

Out[25]:

312

This makes sense because it is 3 times 104 (the number of rows that were removed from archive_clean).

In [26]:

predictions_clean = predictions_clean[~not_shared]

Test

In [95]:

# Confirm new archive_clean counts
archive_clean.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1971 entries, 0 to 2355
Data columns (total 12 columns):
tweet_id              1971 non-null int64
timestamp             1971 non-null object
source                1971 non-null object
text                  1971 non-null object
expanded_urls         1971 non-null object
rating_numerator      1971 non-null int64
rating_denominator    1971 non-null int64
name                  1367 non-null object
dog_type              322 non-null object
retweet_count         1971 non-null float64
favorite_count        1971 non-null float64
followers_count       1971 non-null float64
dtypes: float64(3), int64(3), object(6)
memory usage: 200.2+ KB

Note: removing all of the replies and retweets also removed all rows that didn’t have the api_data information.

In [83]:

2075 - (23 + 81)

Out[83]:

1971

The expected number of rows was removed, and the redundant columns were dropped.

In [98]:

# Confirm no unshared predictions_clean tweet_id's with archive_clean
(~predictions_clean.tweet_id.isin(list(archive_clean.tweet_id))).sum()

Out[98]:

0

Define

Create a function to remove links and apply it to archive_clean.text.

Code

In [27]:

def remove_link(x):
    http_pos = x.find("http")
    # If no link, retain the text unchanged
    if http_pos == -1:
        return x
    # Remove everything from the link to the end, plus the whitespace before it
    return x[:http_pos].rstrip()

In [28]:

archive_clean.text = archive_clean.text.apply(remove_link)

Test

In [118]:

# Print full text to check endings
for row in archive_clean.text[:5]:
    print(row)
This is Phineas. He's a mystical boy. Only ever appears in the hole of a donut. 13/10
This is Tilly. She's just checking pup on you. Hopes you're doing ok. If not, she's available for pats, snugs, boops, the whole bit. 13/10
This is Archie. He is a rare Norwegian Pouncing Corgo. Lives in the tall grass. You never know when one may strike. 12/10
This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us
This is Franklin. He would like you to stop calling him "cute." He is a very fierce shark and should be respected as such. 12/10 #BarkWeek
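
For comparison, the same clean-up can be written as a vectorized regular-expression replace. This is a sketch on made-up text with placeholder URLs, and it behaves slightly differently from remove_link: it removes only the URLs themselves, so any words after a link are kept.

```python
import pandas as pd

# Toy text (URLs are placeholders, not real tweet links)
text = pd.Series([
    "This is Phineas. 13/10 https://t.co/abc123",
    "No link here. 12/10",
    "Two links 11/10 https://t.co/a https://t.co/b",
])

# Drop every URL together with the whitespace in front of it
cleaned = text.str.replace(r"\s*https?://\S+", "", regex=True)
print(cleaned.tolist())
```

Which behaviour is preferable depends on whether text after a link is ever worth keeping; in this archive the links sit at the end of the tweet, so the two approaches should mostly agree.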

Values for rating_numerator are incorrect

Define

Create a function that identifies the value before the last / in the text and uses this in the rating_numerator column. Manually correct any ratings that are not covered by the function.

Code

In [42]:

def find_numerator(x):
    # Ratings are associated with the last "/"
    slash = x.rfind("/")
    # Don't need to check for missing because original set only includes tweets with ratings
    # Most ratings are two digits, but if not, preceded by " ", "()" or "..."
    try:
        # Check for decimal
        if x[slash - 2] == ".":
            numerator = x[slash - 4:slash].strip()
            # Drop any leading dots left over from an ellipsis
            if numerator[0] == ".":
                numerator = numerator.strip(".")
        else:
            numerator = x[slash - 2:slash].strip().strip("(")
        return float(numerator)
    # Manage strange formatting
    except ValueError:
        return np.nan

In [43]:

archive_clean.rating_numerator = archive_clean.text.apply(find_numerator)

In [44]:

# Identify strange formatting
missing_numerator = list(archive_clean[archive_clean.rating_numerator.isnull()].index)
missing_numerator

Out[44]:

[2216, 2246]

In [42]:

# Check full text for each
for index in missing_numerator:
    print(index, archive_clean.text[index])
2216 This is Spark. He's nervous. Other dog hasn't moved in a while. Won't come when called. Doesn't fetch well 8/10&1/10
2246 This is Tedrick. He lives on the edge. Needs someone to hit the gas tho. Other than that he's a baller. 10&2/10

One contains two ratings and the other is a humorous expression related to the picture. I’m going to go with 8 and 10.

In [45]:

archive_clean.at[missing_numerator[0], 'rating_numerator'] = 8
archive_clean.at[missing_numerator[1], 'rating_numerator'] = 10

Test

In [46]:

# Check all values are filled
archive_clean.rating_numerator.isnull().sum()

Out[46]:

0

In [85]:

# Check range of values
archive_clean.rating_numerator.describe()

Out[85]:

count    1971.000000
mean       10.893709
std         5.103397
min         0.000000
25%        10.000000
50%        11.000000
75%        12.000000
max        99.000000
Name: rating_numerator, dtype: float64

Values seem more in line with expectations (most over 10, but not many at 15 and over).

Values for rating_denominator are incorrect

Define

Create a function that identifies the value after the last / in the text and uses this in the rating_denominator column.

Code

In [48]:

def find_denominator(x):
    # Ratings are associated with the last "/"
    slash = x.rfind("/")
    # Don't need to check for missing because original set only includes tweets with ratings
    # Expect denominator to be two digits
    try:
        denominator = x[slash + 1:slash + 3]
        return float(denominator)
    # Manage strange formatting
    except ValueError:
        return np.nan

In [49]:

archive_clean.rating_denominator = archive_clean.text.apply(find_denominator)

Test

In [86]:

# Check all values are filled
archive_clean.rating_denominator.isnull().sum()

Out[86]:

0

In [87]:

# Check range of values
archive_clean.rating_denominator.describe()

Out[87]:

count    1971.000000
mean       10.203957
std         3.483537
min         7.000000
25%        10.000000
50%        10.000000
75%        10.000000
max        90.000000
Name: rating_denominator, dtype: float64

Most denominators are expected to be 10.
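
The two slicing functions above could also be expressed as one regular expression that captures the numerator and denominator together. The sketch below is an alternative under my own assumptions (decimal numerators allowed, whole-number denominators, last match wins); last_rating is a hypothetical helper, not part of the notebook, and the second and third sample strings are made up.

```python
import re
import numpy as np

# Numerator may be a decimal; denominator is assumed to be a whole number
RATING = re.compile(r"(\d+(?:\.\d+)?)/(\d+)")

def last_rating(text):
    """Return (numerator, denominator) of the last 'x/y' pattern, or NaNs."""
    matches = RATING.findall(text)
    if not matches:
        return (np.nan, np.nan)
    numerator, denominator = matches[-1]
    return (float(numerator), float(denominator))

print(last_rating("This is Darla. She commenced a snooze mid meal. 13/10 happens to the best of us"))
print(last_rating("Made-up tweet with a decimal rating 9.75/10"))
print(last_rating("No rating in this made-up tweet"))
```

Tweets such as "8/10&1/10" would still need a manual look, because taking the last match picks 1/10 there rather than the 8 chosen above.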

Erroneous datatypes

  • By melting the predictions table, an additional erroneous data type was created in the prediction_order column.
  • Collapsing the stage columns of the archive table into a single dog_type column created an additional erroneous data type in that column.

Define

archive_clean table:

  • tweet_id: change to str
  • timestamp: change to datetime
  • dog_type: categorical

predictions_clean table:

  • tweet_id: change to str
  • prediction_order: change to categorical

Code

In [50]:

# Change tweet_id's
archive_clean.tweet_id = archive_clean.tweet_id.astype(str)
predictions_clean.tweet_id = predictions_clean.tweet_id.astype(str)

In [51]:

# Change timestamp
archive_clean.timestamp = pd.to_datetime(archive_clean.timestamp)

In [52]:

# Change dog_type and prediction order
archive_clean.dog_type = archive_clean.dog_type.astype("category")
predictions_clean.prediction_order = predictions_clean.prediction_order.astype("category")

Test

In [116]:

archive_clean.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1971 entries, 0 to 2355
Data columns (total 12 columns):
tweet_id              1971 non-null object
timestamp             1971 non-null datetime64[ns]
source                1971 non-null object
text                  1971 non-null object
expanded_urls         1971 non-null object
rating_numerator      1971 non-null float64
rating_denominator    1971 non-null float64
name                  1367 non-null object
dog_type              322 non-null category
retweet_count         1971 non-null float64
favorite_count        1971 non-null float64
followers_count       1971 non-null float64
dtypes: category(1), datetime64[ns](1), float64(5), object(5)
memory usage: 266.9+ KB

In [117]:

predictions_clean.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5913 entries, 0 to 6224
Data columns (total 7 columns):
tweet_id            5913 non-null object
jpg_url             5913 non-null object
img_num             5913 non-null int64
prediction_order    5913 non-null category
prediction          5913 non-null object
confidence          5913 non-null float64
dog                 5913 non-null bool
dtypes: bool(1), category(1), float64(1), int64(1), object(3)
memory usage: 288.8+ KB

Save Cleaned Data

In [118]:

archive_clean.to_csv('twitter_archive_master.csv', index=False)
predictions_clean.to_csv('predictions_master.csv', index=False)

Analyze and Visualize

In [54]:

archive = pd.read_csv('twitter_archive_master.csv')
predictions = pd.read_csv('predictions_master.csv')

In [55]:

archive.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1971 entries, 0 to 1970
Data columns (total 12 columns):
tweet_id              1971 non-null int64
timestamp             1971 non-null object
source                1971 non-null object
text                  1971 non-null object
expanded_urls         1971 non-null object
rating_numerator      1971 non-null float64
rating_denominator    1971 non-null float64
name                  1367 non-null object
dog_type              322 non-null object
retweet_count         1971 non-null float64
favorite_count        1971 non-null float64
followers_count       1971 non-null float64
dtypes: float64(5), int64(1), object(6)
memory usage: 184.9+ KB

In [56]:

predictions.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5913 entries, 0 to 5912
Data columns (total 7 columns):
tweet_id            5913 non-null int64
jpg_url             5913 non-null object
img_num             5913 non-null int64
prediction_order    5913 non-null int64
prediction          5913 non-null object
confidence          5913 non-null float64
dog                 5913 non-null bool
dtypes: bool(1), float64(1), int64(3), object(2)
memory usage: 283.0+ KB

All of the types were lost in the round trip through csv, so I need to re-apply those conversions.

In [57]:

# Change types
archive.tweet_id = archive.tweet_id.astype(str)
predictions.tweet_id = predictions.tweet_id.astype(str)
archive.dog_type = archive.dog_type.astype("category")
predictions.prediction_order = predictions.prediction_order.astype("category")
archive.timestamp = pd.to_datetime(archive.timestamp)
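
CSV files carry no type information, which is why the conversions have to be repeated. One way to avoid the extra cell is to declare the types while reading; the sketch below uses an in-memory stand-in for the saved file, with made-up rows.

```python
import io
import pandas as pd

# Stand-in for the saved CSV (rows made up for illustration)
csv_text = io.StringIO(
    "tweet_id,timestamp,dog_type\n"
    "100000000000000001,2017-08-01 16:23:56,\n"
    "100000000000000002,2017-07-26 15:59:51,doggo\n"
)

df = pd.read_csv(
    csv_text,
    dtype={"tweet_id": str, "dog_type": "category"},  # keep ids as text
    parse_dates=["timestamp"],                          # parse while loading
)
print(df.dtypes)
```

Passing dtype and parse_dates keeps the loading and typing in one place, so the types cannot drift apart from the file they describe.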

In [125]:

pd.plotting.scatter_matrix(archive.iloc[:, 1:], figsize=(15, 15));

Retweet Counts

In [58]:

archive.retweet_count.describe()

Out[58]:

count     1971.000000
mean      2725.781329
std       4699.394366
min         13.000000
25%        608.500000
50%       1323.000000
75%       3127.500000
max      77141.000000
Name: retweet_count, dtype: float64

In [59]:

def set_my_palette():
    sns.set()
    current_palette = sns.color_palette(my_palette)
    sns.set_palette(current_palette)

In [60]:

my_palette = ['#66b3ff', '#00cc99', '#ff6666', '#ffff66', '#8c66ff', '#66ffd9']
set_my_palette()
archive.retweet_count.hist();

In [61]:

archive[archive.retweet_count <= 20000].retweet_count.hist();

In [62]:

archive[archive.retweet_count <= 2500].retweet_count.hist();

Favorites Count

In [63]:

archive.favorite_count.describe()

Out[63]:

count      1971.000000
mean       8880.210046
std       12590.668395
min          80.000000
25%        1938.500000
50%        4040.000000
75%       11164.500000
max      143023.000000
Name: favorite_count, dtype: float64

In [64]:

archive.favorite_count.hist();

In [65]:

archive[archive.favorite_count <= 40000].favorite_count.hist();

In [66]:

archive[archive.favorite_count <= 5000].favorite_count.hist();

In [67]:

archive.name.value_counts().head(10)

Out[67]:

Charlie    13
Oliver     11
Cooper     10
Lucy        9
Tucker      9
Daisy       8
Penny       8
Winston     8
Lola        7
Stanley     6
Name: name, dtype: int64

Over Time

Followers

In [68]:

plt.subplots(figsize=(15, 9))
plt.plot(archive.timestamp, archive.followers_count);

In [69]:

archive.followers_count.describe()

Out[69]:

count    1.971000e+03
mean     6.989401e+06
std      8.672666e+01
min      6.988654e+06
25%      6.989352e+06
50%      6.989386e+06
75%      6.989467e+06
max      6.989511e+06
Name: followers_count, dtype: float64

There are some strange spikes that don’t seem to make sense. Because the count corrects back to its previous level after each one, I can subset the data to remove them, keeping only values above 6989200.

In [70]:

follower_count = archive.query('followers_count > 6989200')

In [71]:

sns.set_context("talk")
plt.subplots(figsize=(12, 8))
plt.plot(follower_count.timestamp, follower_count.followers_count)
plt.ylim(6989200, 6989800)
plt.title('Are We In Trouble?\n', fontsize=18, weight='bold')
plt.xlabel('\nDate (YYYY-MM)', weight='bold')
plt.ylabel('Number of Followers\n', weight='bold');
plt.savefig('in-trouble.png')

Retweets

In [72]:

sns.set_context()
plt.subplots(figsize=(15, 9))
plt.plot(archive.timestamp, archive.retweet_count);

In [73]:

weekly_retweet = archive.groupby(pd.Grouper(key='timestamp', freq='1w'))['retweet_count'].sum()\
                    .reset_index().sort_values('timestamp')[:-1]

In [74]:

plt.subplots(figsize=(15, 9))
plt.plot(weekly_retweet.timestamp, weekly_retweet.retweet_count);
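
The weekly totals above come from grouping on a pd.Grouper; DataFrame.resample with on= is an equivalent spelling. A small sketch on made-up data (one tweet per day for three full weeks) shows that both give the same sums, labelled by the Sunday that ends each week:

```python
import pandas as pd

# Toy data (made up): one tweet per day, starting on a Monday
toy = pd.DataFrame({
    "timestamp": pd.date_range("2017-01-02", periods=21, freq="D"),
    "retweet_count": range(21),
})

by_grouper = toy.groupby(pd.Grouper(key="timestamp", freq="W"))["retweet_count"].sum()
by_resample = toy.resample("W", on="timestamp")["retweet_count"].sum()

# Both approaches bin by week ending Sunday and sum within each bin
print(by_grouper.tolist())
print(by_resample.tolist())
```

In the real data the final bin is usually a partial week, which is why the cell above drops the last row with [:-1] before plotting.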

Favorites

In [75]:

plt.subplots(figsize=(15, 9))
plt.plot(archive.timestamp, archive.favorite_count);

In [76]:

weekly_favorite = archive.groupby(pd.Grouper(key='timestamp', freq='1w'))['favorite_count'].sum()\
                    .reset_index().sort_values('timestamp')[:-1]

In [77]:

plt.subplots(figsize=(15, 9))
plt.plot(weekly_favorite.timestamp, weekly_favorite.favorite_count);

In [78]:

sns.set_context("talk")
plt.subplots(figsize=(14, 9))
plt.plot(weekly_retweet.timestamp, weekly_retweet.retweet_count, label="Weekly Retweets")
plt.plot(weekly_favorite.timestamp, weekly_favorite.favorite_count, label="Weekly Favorites")
plt.title('Their Love Increases\n', fontsize=18, weight='bold')
plt.xlabel('\nDate (YYYY-MM)', weight='bold')
plt.ylabel('Count\n', weight='bold')
plt.legend();
plt.savefig('love-increases.png')

Dog Types

In [79]:

dog_counts = archive.groupby('dog_type')['tweet_id'].count()
dog_counts

Out[79]:

dog_type
doggo       66
floofer      3
pupper     225
puppo       28
Name: tweet_id, dtype: int64

In [80]:

sns.set_context("talk")
plt.subplots(figsize=(12, 6))
plt.bar([1, 2, 3, 4], dog_counts, tick_label=['doggo', 'floofer', 'pupper', 'puppo'])
plt.title('Favorite Dogs?\n', fontsize=18, weight='bold')
plt.xlabel('\nDog Type', weight='bold')
plt.ylabel('Count\n', weight='bold');
plt.savefig('favorite-dogs.png')

In [81]:

# Set outlier style
flierprops = dict(marker='o', alpha=0.5, markeredgewidth=1)

plt.subplots(figsize=(14, 8))
plt.subplot(121)
sns.boxplot(x=archive.dog_type, y=archive.retweet_count, flierprops=flierprops, linewidth=1.5)
plt.title('Retweets\n', fontsize=18, weight='bold')
plt.xlabel('\nDog Type', weight='bold')
plt.ylabel('Count\n', weight='bold');

plt.subplot(122)
sns.boxplot(x=archive.dog_type, y=archive.favorite_count, flierprops=flierprops, linewidth=1.5)
plt.title('Favorites\n', fontsize=18, weight='bold')
plt.xlabel('\nDog Type', weight='bold')
plt.ylabel('')
plt.savefig('boxplot.png')

Highest Rated

Retweet

In [85]:

# Get index
ind = archive.retweet_count.nlargest(5).index
# Get details
high_retweet = archive[['tweet_id', 'text', 'name', 'retweet_count', 'favorite_count', 'rating_numerator', 'rating_denominator', 'dog_type']].loc[ind]
high_retweet

Out[85]:

               tweet_id                                             text     name  retweet_count  favorite_count  rating_numerator  rating_denominator  dog_type
769  744234799360020481  Here’s a doggo realizing you can stand in a po…      NaN        77141.0        127911.0              13.0                10.0     doggo
397  807106840509214720  This is Stephan. He just wants to help. 13/10 …  Stephan        60929.0        122715.0              13.0                10.0       NaN
804  739238157791694849  Here’s a doggo blowing bubbles. It’s downright…      NaN        50727.0         73148.0              13.0                10.0     doggo
306  822872901745569793  Here’s a super supportive puppo participating …      NaN        48967.0        143023.0              13.0                10.0     puppo
 58  879415818425184262  This is Duddles. He did an attempt. 13/10 some…  Duddles        44476.0        105712.0              13.0                10.0       NaN

Two of the five have names and three do not. All are rated 13/10. Two are doggos and one is a puppo.

In [34]:

high_retweet.describe()

Out[34]:

       retweet_count  favorite_count  rating_numerator  rating_denominator
count       5.000000        5.000000               5.0                 5.0
mean    56448.000000   114501.800000              13.0                10.0
std     13041.315079    26683.887961               0.0                 0.0
min     44476.000000    73148.000000              13.0                10.0
25%     48967.000000   105712.000000              13.0                10.0
50%     50727.000000   122715.000000              13.0                10.0
75%     60929.000000   127911.000000              13.0                10.0
max     77141.000000   143023.000000              13.0                10.0

Get image urls from predictions.

In [56]:

image = predictions[predictions.tweet_id == '744234799360020481']['jpg_url']
dups = image.duplicated()
image = image[~dups]
image.values[0]

Out[56]:

'https://pbs.twimg.com/ext_tw_video_thumb/744234667679821824/pu/img/1GaWmtJtdqzZV7jy.jpg'

In [58]:

url_list = []
for tweet_id in high_retweet.tweet_id:
    image = predictions[predictions.tweet_id == tweet_id]['jpg_url']
    dups = image.duplicated()
    image = image[~dups]
    image_url = image.values[0]
    url_list.append(image_url)
    
url_list

Out[58]:

['https://pbs.twimg.com/ext_tw_video_thumb/744234667679821824/pu/img/1GaWmtJtdqzZV7jy.jpg',
 'https://pbs.twimg.com/ext_tw_video_thumb/807106774843039744/pu/img/8XZg1xW35Xp2J6JW.jpg',
 'https://pbs.twimg.com/ext_tw_video_thumb/739238016737267712/pu/img/-tLpyiuIzD5zR1et.jpg',
 'https://pbs.twimg.com/media/C2tugXLXgAArJO4.jpg',
 'https://pbs.twimg.com/ext_tw_video_thumb/879415784908390401/pu/img/cX7XI1TnUsseGET5.jpg']
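
The loop above works; as a sketch of a more compact alternative, the lookup can be done once by keeping one row per tweet and indexing by tweet_id. The toy frame and the top_ids list below are made up for illustration.

```python
import pandas as pd

# Toy long-format predictions (made up): three rows per tweet, same jpg_url
preds = pd.DataFrame({
    "tweet_id": ["1", "1", "1", "2", "2", "2"],
    "jpg_url": ["u1", "u1", "u1", "u2", "u2", "u2"],
})
top_ids = ["2", "1"]  # hypothetical ranking order

# One URL per tweet, then look the ids up in ranking order
url_by_id = preds.drop_duplicates("tweet_id").set_index("tweet_id")["jpg_url"]
url_list = url_by_id.loc[top_ids].tolist()
print(url_list)
```

Because .loc preserves the order of the labels it is given, the resulting list lines up with the ranking.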

In [144]:

Image(url= url_list[0], width=150, height=150)

Out[144]:

In [75]:

print(high_retweet.text.loc[ind[0]])
Here's a doggo realizing you can stand in a pool. 13/10 enlightened af (vid by Tina Conrad)

In [68]:

Image(url= url_list[1], width=250, height=250)

Out[68]:

In [76]:

print(high_retweet.text.loc[ind[1]])
This is Stephan. He just wants to help. 13/10 such a good boy

In [69]:

Image(url= url_list[2], width=300, height=300)

Out[69]:

In [77]:

print(high_retweet.text.loc[ind[2]])
Here's a doggo blowing bubbles. It's downright legendary. 13/10 would watch on repeat forever (vid by Kent Duryee)

In [78]:

Image(url= url_list[3], width=300, height=300)

Out[78]:

In [79]:

print(high_retweet.text.loc[ind[3]])
Here's a super supportive puppo participating in the Toronto  #WomensMarch today. 13/10

In [80]:

Image(url= url_list[4], width=300, height=300)

Out[80]:

In [87]:

print(high_retweet.text.loc[ind[4]])
This is Duddles. He did an attempt. 13/10 someone help him (vid by Georgia Felici)

Four of the five most-retweeted tweets are videos.

Favorite

In [88]:

# Get index
ind = archive.favorite_count.nlargest(5).index
# Get details
high_favorite = archive[['tweet_id', 'text', 'name', 'retweet_count', 'favorite_count', 'rating_numerator', 'rating_denominator', 'dog_type']].loc[ind]
high_favorite

Out[88]:

               tweet_id                                             text     name  retweet_count  favorite_count  rating_numerator  rating_denominator  dog_type
306  822872901745569793  Here’s a super supportive puppo participating …      NaN        48967.0        143023.0              13.0                10.0     puppo
769  744234799360020481  Here’s a doggo realizing you can stand in a po…      NaN        77141.0        127911.0              13.0                10.0     doggo
108  866450705531457537  This is Jamesy. He gives a kiss to every other…   Jamesy        36296.0        124101.0              13.0                10.0    pupper
397  807106840509214720  This is Stephan. He just wants to help. 13/10 …  Stephan        60929.0        122715.0              13.0                10.0       NaN
 58  879415818425184262  This is Duddles. He did an attempt. 13/10 some…  Duddles        44476.0        105712.0              13.0                10.0       NaN

In [89]:

high_favorite.tweet_id.isin(high_retweet.tweet_id)

Out[89]:

306     True
769     True
108    False
397     True
58      True
Name: tweet_id, dtype: bool

Only one tweet isn’t shared between the two lists.

In [91]:

image = predictions[predictions.tweet_id == high_favorite.tweet_id.loc[108]]['jpg_url']
dups = image.duplicated()
image = image[~dups]
image_url = image.values[0]

In [92]:

Image(url= image_url, width=300, height=300)

Out[92]:

In [96]:

print(high_favorite.text.loc[108])
This is Jamesy. He gives a kiss to every other pupper he sees on his walk. 13/10 such passion, much tender
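
A note on the top-five selections in this section: nlargest(5).index holds index labels, so label-based selection is the safe way to use it. Labels and row positions only coincide here because read_csv produced a fresh 0..n-1 index. The made-up frame below, whose labels deliberately differ from positions, sketches the difference.

```python
import pandas as pd

# Toy frame (made up) whose index labels differ from row positions,
# as happens after filtering without resetting the index
toy = pd.DataFrame(
    {"retweet_count": [5, 50, 500]},
    index=[10, 20, 30],
)

ind = toy.retweet_count.nlargest(2).index        # labels: 30, 20

by_label = toy.loc[ind]                          # selects by label
by_method = toy.nlargest(2, "retweet_count")     # same rows in one call
print(by_label.index.tolist(), by_method.index.tolist())

# toy.iloc[ind] would raise IndexError here, because 30 and 20 are not positions
```

DataFrame.nlargest returns the rows directly, which avoids the intermediate index altogether.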

Rating to Retweet or Favorite

In [82]:

sns.set_context()
plt.scatter(archive.rating_numerator, archive.retweet_count);

Keep ratings of 17 or less and log-transform the retweet and favorite counts.

In [83]:

ratings_df = archive.query('rating_numerator <= 17').copy()
ratings_df.retweet_count = ratings_df.retweet_count.transform(lambda x: np.log10(x))
ratings_df.favorite_count = ratings_df.favorite_count.transform(lambda x: np.log10(x))
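
The log transform is safe here because the smallest counts in this data are 13 retweets and 80 favorites. If a zero ever appeared, log10 would return -inf; a minimal sketch of guarding against that, on made-up values, is:

```python
import numpy as np
import pandas as pd

counts = pd.Series([0, 13, 1323, 77141])  # toy values; 0 added to show the edge case

# np.log10 is vectorized, so it can be applied to the Series directly;
# mask zeros first so they become NaN instead of -inf
log_counts = np.log10(counts.where(counts > 0))
print(log_counts.round(2).tolist())
```

Plotting functions skip NaN points, so masked rows simply drop out of the scatter plot.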

In [84]:

sns.set_context("talk")
plt.subplots(figsize=(14, 8))
plt.subplot(121)
sns.regplot(x='rating_numerator', 
            y='retweet_count', 
            data=ratings_df, 
            fit_reg=False, 
            x_jitter=0.25, 
            scatter_kws={'alpha': 0.2, 's': 30}, 
            color=my_palette[1])
plt.title('Retweets\n', fontsize=18, weight='bold')
plt.xlabel('\nNumerator', weight='bold')
plt.ylabel('Count (log10)', weight='bold');

plt.subplot(122)
sns.regplot(x='rating_numerator', 
            y='favorite_count', 
            data=ratings_df, 
            fit_reg=False, 
            x_jitter=0.25, 
            scatter_kws={'alpha': 0.2, 's': 30}, 
            color=my_palette[-2])
plt.title('Favorites\n', fontsize=18, weight='bold')
plt.xlabel('\nNumerator', weight='bold')
plt.ylabel('', weight='bold');
plt.savefig('ratings.png')

Predictions

How Confident

In [156]:

confidence = predictions.groupby('prediction_order')['confidence']

In [150]:

confidence.mean()

Out[150]:

prediction_order
1    0.594558
2    0.134585
3    0.060166
Name: confidence, dtype: float64

In [151]:

confidence.median()

Out[151]:

prediction_order
1    0.587764
2    0.117397
3    0.049444
Name: confidence, dtype: float64

In [152]:

confidence.std()

Out[152]:

prediction_order
1    0.272126
2    0.101053
3    0.050942
Name: confidence, dtype: float64

In [153]:

confidence.mean() - confidence.std()

Out[153]:

prediction_order
1    0.322431
2    0.033532
3    0.009224
Name: confidence, dtype: float64

In [154]:

confidence.mean() + confidence.std()

Out[154]:

prediction_order
1    0.866684
2    0.235638
3    0.111107
Name: confidence, dtype: float64
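The summary cells above could also be collapsed into a single aggregation. A sketch, assuming the same long layout as predictions (one row per prediction, with prediction_order and confidence columns); the numbers are invented and the low/high column names are hypothetical:

```python
import pandas as pd

# Invented values in the same long layout as `predictions`.
toy = pd.DataFrame({
    "prediction_order": [1, 1, 2, 2, 3, 3],
    "confidence": [0.9, 0.5, 0.2, 0.1, 0.06, 0.02],
})

# Mean, median and standard deviation per prediction order in one pass.
summary = toy.groupby("prediction_order")["confidence"].agg(["mean", "median", "std"])

# One-standard-deviation band around the mean.
summary["low"] = summary["mean"] - summary["std"]
summary["high"] = summary["mean"] + summary["std"]
print(summary)
```

Keeping everything in one frame makes it easier to compare the three prediction orders side by side.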

In [92]:

sns.FacetGrid(predictions, col="prediction_order", hue="prediction_order", palette=my_palette[3:], height=4)\
    .map(plt.hist, "confidence")\
    .set_titles("Prediction {col_name}\n", weight='bold', fontsize=14)\
    .set_axis_labels("\nConfidence Rating", "Count\n");
plt.savefig('confidence.png')

Samples

In [189]:

samples = predictions.query('prediction_order == 1').sample(5)
samples

Out[189]:

      tweet_id            jpg_url                                          img_num  prediction_order  prediction          confidence  dog
5106  829141528400556032  https://pbs.twimg.com/media/C4GzztSWAAA_qi4.jpg  2        1                 golden_retriever    0.573140    True
5763  881536004380872706  https://pbs.twimg.com/ext_tw_video_thumb/88153…  1        1                 Samoyed             0.281463    True
5112  829449946868879360  https://pbs.twimg.com/media/C4LMUf8WYAkWz4I.jpg  1        1                 Labrador_retriever  0.315163    True
1233  674014384960745472  https://pbs.twimg.com/media/CVqUgTIUAAUA8Jr.jpg  1        1                 Pembroke            0.742320    True
753   670733412878163972  https://pbs.twimg.com/media/CU7seitWwAArlVy.jpg  1        1                 dhole               0.350416    False

First

In [190]:

Image(url=samples.jpg_url.iloc[0], width=300, height=300)

Out[190]:

In [191]:

predictions[predictions.tweet_id == samples.tweet_id.iloc[0]]

Out[191]:

      tweet_id            jpg_url                                          img_num  prediction_order  prediction          confidence  dog
5106  829141528400556032  https://pbs.twimg.com/media/C4GzztSWAAA_qi4.jpg  2        1                 golden_retriever    0.573140    True
5107  829141528400556032  https://pbs.twimg.com/media/C4GzztSWAAA_qi4.jpg  2        2                 cocker_spaniel      0.111159    True
5108  829141528400556032  https://pbs.twimg.com/media/C4GzztSWAAA_qi4.jpg  2        3                 gibbon              0.094127    False

Spot on!

Second

In [192]:

Image(url=samples.jpg_url.iloc[1], width=300, height=300)

Out[192]:

In [193]:

predictions[predictions.tweet_id == samples.tweet_id.iloc[1]]

Out[193]:

      tweet_id            jpg_url                                          img_num  prediction_order  prediction          confidence  dog
5763  881536004380872706  https://pbs.twimg.com/ext_tw_video_thumb/88153…  1        1                 Samoyed             0.281463    True
5764  881536004380872706  https://pbs.twimg.com/ext_tw_video_thumb/88153…  1        2                 Angora              0.272066    False
5765  881536004380872706  https://pbs.twimg.com/ext_tw_video_thumb/88153…  1        3                 Persian_cat         0.114854    False

Nice! And that from a shot of a pup’s behind!

Third

In [194]:

Image(url=samples.jpg_url.iloc[2], width=300, height=300)

Out[194]:

In [195]:

predictions[predictions.tweet_id == samples.tweet_id.iloc[2]]

Out[195]:

      tweet_id            jpg_url                                          img_num  prediction_order  prediction          confidence  dog
5112  829449946868879360  https://pbs.twimg.com/media/C4LMUf8WYAkWz4I.jpg  1        1                 Labrador_retriever  0.315163    True
5113  829449946868879360  https://pbs.twimg.com/media/C4LMUf8WYAkWz4I.jpg  1        2                 golden_retriever    0.153210    True
5114  829449946868879360  https://pbs.twimg.com/media/C4LMUf8WYAkWz4I.jpg  1        3                 Pekinese            0.132791    True

Damn, even with a hat!

Fourth

In [196]:

Image(url=samples.jpg_url.iloc[3], width=300, height=300)

Out[196]:

In [197]:

predictions[predictions.tweet_id == samples.tweet_id.iloc[3]]

Out[197]:

      tweet_id            jpg_url                                          img_num  prediction_order  prediction          confidence  dog
1233  674014384960745472  https://pbs.twimg.com/media/CVqUgTIUAAUA8Jr.jpg  1        1                 Pembroke            0.742320    True
1234  674014384960745472  https://pbs.twimg.com/media/CVqUgTIUAAUA8Jr.jpg  1        2                 Cardigan            0.084937    True
1235  674014384960745472  https://pbs.twimg.com/media/CVqUgTIUAAUA8Jr.jpg  1        3                 Eskimo_dog          0.068321    True

Not just a corgi, a PEMBROKE corgi.

Fifth

In [198]:

Image(url=samples.jpg_url.iloc[4], width=300, height=300)

Out[198]:

In [199]:

predictions[predictions.tweet_id == samples.tweet_id.iloc[4]]

Out[199]:

      tweet_id            jpg_url                                          img_num  prediction_order  prediction          confidence  dog
753   670733412878163972  https://pbs.twimg.com/media/CU7seitWwAArlVy.jpg  1        1                 dhole               0.350416    False
754   670733412878163972  https://pbs.twimg.com/media/CU7seitWwAArlVy.jpg  1        2                 hare                0.236661    False
755   670733412878163972  https://pbs.twimg.com/media/CU7seitWwAArlVy.jpg  1        3                 wood_rabbit         0.091133    False

Squirrel…

So not quite right on the last one, but definitely in the right vicinity.
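Five random samples give an impression rather than a measurement. One rough way to put a number on it, sketched here on invented data, is the share of first predictions flagged as a dog breed. This assumes the dog column is boolean, as it appears in the tables above, and it only checks that the classifier saw a dog, not that it picked the right breed:

```python
import pandas as pd

# Invented rows in the same layout as `predictions`.
toy = pd.DataFrame({
    "prediction_order": [1, 1, 1, 1, 2],
    "dog": [True, True, True, False, False],
})

# Keep only the top prediction for each image.
top = toy.query("prediction_order == 1")

# The mean of a boolean column is the fraction of True values.
dog_share = top["dog"].mean()
print(f"{dog_share:.0%} of first predictions are dog breeds")
```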
