-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathusing-tweepy.html
More file actions
446 lines (412 loc) · 30.6 KB
/
using-tweepy.html
File metadata and controls
446 lines (412 loc) · 30.6 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
<!DOCTYPE HTML>
<!--
Dopetrope 2.0 by HTML5 UP
html5up.net | @n33co
Free for personal and commercial use under the CCA 3.0 license (html5up.net/license)
-->
<html>
<head>
<title>Three.Fourteen</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta charset="utf-8" />
<meta property="og:site_name" content="Three.Fourteen" />
<meta property="og:type" content="article"/>
<meta property="og:title" content="Best Practices for Querying Twitter API using (forked) tweepy"/>
<meta property="og:url" content="http://www.nirg.net/using-tweepy.html"/>
<meta property="og:description" content="Over the past few weeks I have been accessing Twitter API from python using tweepy. In the course of doing so, I added support in tweepy for using multiple authentication handlers, monitoring rate limits, waiting when running out of calls, support for search using Twitter API v1.1, proper pagination …"/>
<meta property="article:published_time" content="2013-04-28" />
<meta property="article:section" content="Uncategorized" />
<meta property="article:author" content="grinbergnir" />
<meta property="og:image"
content="http://www.nirg.net/images/twython1.jpg"/>
<meta name="twitter:dnt" content="on">
<meta name="twitter:card" content="summary">
<meta name="twitter:domain" content=".">
<meta property="twitter:image"
content="http://www.nirg.net/images/twython1.jpg"/>
<link href="http://fonts.googleapis.com/css?family=Source+Sans+Pro:300,400,700,900,300italic" rel="stylesheet" />
<link rel="stylesheet" href="/theme/css/pygment.css" />
<noscript>
<link rel="stylesheet" href="/theme/css/skel-noscript.css" />
<link rel="stylesheet" href="/theme/css/style.css" />
<link rel="stylesheet" href="/theme/css/style-desktop.css" />
</noscript>
<!-- <script>var _gaq=[['_setAccount','UA-49475310-1'],['_trackPageview']];(function(d,t){var g=d.createElement(t),s=d.getElementsByTagName(t)[0];g.src='//www.google-analytics.com/ga.js';s.parentNode.insertBefore(g,s)}(document,'script'))</script> -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-49475310-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-49475310-1');
</script>
<link rel="shortcut icon" href="/images/favicon.ico" type="image/x-icon">
<link rel="icon" href="/images/favicon.ico" type="image/x-icon">
</head>
<body class="no-sidebar" id="Blog">
<!-- Header Wrapper -->
<div id="header-wrapper">
<div class="container">
<div class="row">
<div class="12u">
<!-- Header -->
<section id="header">
<!-- Logo -->
<h1><a href=".">Three.Fourteen</a></h1>
<!-- Nav -->
<nav id="nav">
<ul>
<li><a href="/index.html">Home</a></li>
<li class="current_page_item"><a href="/blog.html">Blog</a></li>
<li><a href="./pages/about.html">About</a></li>
<li><a href="./pages/publications.html">Publications</a></li>
</ul>
</nav>
</section>
</div>
</div>
</div>
</div>
<!-- Main Wrapper -->
<div id="main-wrapper" style="padding-top: 3em">
<div class="container">
<div class="row">
<div class="12u">
<!-- Portfolio -->
<section>
<div>
<div class="row">
<div class="12u skel-cell-mainContent">
<!-- Content -->
<article class="box is-post">
<center><h2>Best Practices for Querying Twitter API using (forked) tweepy</h2></center>
<a href="#" class="image image-centered"><img src="images/twython1.jpg" style="border: 1px solid rgba(88,88,88,0.2);max-width: 100%;max-height: 100%;" /></a>
Published on April 28, 2013.
<p>Over the past few weeks I have been accessing Twitter API from python
using tweepy. In the course of doing so, I added support in tweepy for
using multiple authentication handlers, monitoring rate limits, waiting
when running out of calls, support for search using Twitter API v1.1,
proper pagination of results and more. In this blog post I will describe
the major changes in my forked version of the tweepy package and provide
some best practices and examples for querying the Twitter API robustly.
Fork me on <a href="http://github.com/nirg/tweepy">GitHub</a> or follow my pull request on the main tweepy repo
<a href="http://github.com/tweepy/tweepy/pull/282">#282</a> to get the complete code referenced in this post.</p>
<h2>New Features</h2>
<ul>
<li>
<p><strong>Support for multiple authentication handlers</strong>: sometimes you just
need more than a single authentication handler. You can provide a
list of authentication handlers to tweepy and manually switch
between them: </p>
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">tweepy</span>
<span class="c1"># create authentication handlers given pre-existing keys </span>
<span class="n">auths</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">consumer_key</span><span class="p">,</span> <span class="n">consumer_secret</span><span class="p">,</span> <span class="n">access_key</span><span class="p">,</span> <span class="n">access_secret</span> <span class="ow">in</span> <span class="n">oauth_keys</span><span class="p">:</span>
<span class="n">auth</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">OAuthHandler</span><span class="p">(</span><span class="n">consumer_key</span><span class="p">,</span> <span class="n">consumer_secret</span><span class="p">)</span>
<span class="n">auth</span><span class="o">.</span><span class="n">set_access_token</span><span class="p">(</span><span class="n">access_key</span><span class="p">,</span> <span class="n">access_secret</span><span class="p">)</span>
<span class="n">auths</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">auth</span><span class="p">)</span>
<span class="n">api</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">API</span><span class="p">(</span><span class="n">auths</span><span class="p">)</span>
<span class="c1"># switch to the second authentication handler (zero-based index) </span>
<span class="n">api</span><span class="o">.</span><span class="n">auth_idx</span> <span class="o">=</span> <span class="mi">1</span>
</pre></div>
</li>
<li>
<p><strong>Monitoring rate limit</strong>: switching authentication handlers
manually is nice, but it is far more useful to have tweepy do that
automatically based on the number of remaining api calls.
The <code>monitor_rate_limit</code> argument does exactly that - monitor to
number of remaining calls to the Twitter API endpoint and switch to
the next authentication handler when running out of calls. This
feature lets you be a good citizen and not supersede your API calls
quota. </p>
<div class="highlight"><pre><span></span><span class="n">api</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">API</span><span class="p">(</span><span class="n">auths</span><span class="p">,</span> <span class="n">monitor_rate_limit</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</pre></div>
</li>
<li>
<p><strong>Blocking when running out of calls</strong>: using the argument
wait_on_rate_limit you can have tweepy wait when running out of
calls. By monitoring the http headers returned from Twitter, tweepy
now knows exactly how long to wait until the next lump of api calls
is available. This is useful when you need more than the number of
calls allowed by the rate limit, but don't want your program to fail
with an exception or lose track of the current position in your
results. </p>
<div class="highlight"><pre><span></span><span class="n">api</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">API</span><span class="p">(</span><span class="n">auths</span><span class="p">,</span>
<span class="n">monitor_rate_limit</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">wait_on_rate_limit</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</pre></div>
</li>
<li>
<p><strong>Proper iteration of result pages (pagination)</strong>: prior to my
forked version of tweepy, if you used the Cursor object to iterate
over results you probably have missed some tweets in your result
set. A good explanation of why this has happened is in Twitter own
documentation on <a href="https://dev.twitter.com/docs/working-with-timelines">working with timelines</a>. Now iteration uses
<code>max_id</code> parameter to properly traverse results when using the
cursor object. If you're wondering what's happening behind the
scenes, take a look at <code>MaxIdIterator</code> class in <code>tweepy/cursor.py</code>.</p>
</li>
<li><strong>Search endpoint now supports Twitter API v1.1</strong>.</li>
<li><strong>Unit testing for all of the above</strong>.</li>
</ul>
<h2>Best Practices</h2>
<ul>
<li>
<p><strong>Retrying on errors</strong>: little know fact, unless you don't dig into
the tweepy code, is that you can have tweepy retry making calls for
you when certain error codes are returned. The retry feature is
useful for handling temporary failures. You specify how many retries
will be performed, delay in seconds between calls and the error
response codes upon which tweepy will retry making calls. For a
complete list of error response codes, see the Twitter API
documentation on <a href="https://dev.twitter.com/docs/error-codes-responses" title="Error Codes & Responses">Error Codes & Responses</a>. Here is an example of
using the retry feature: </p>
<div class="highlight"><pre><span></span><span class="n">api</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">API</span><span class="p">(</span><span class="n">auths</span><span class="p">,</span> <span class="n">retry_count</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">retry_delay</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
<span class="n">retry_errors</span><span class="o">=</span><span class="nb">set</span><span class="p">([</span><span class="mi">401</span><span class="p">,</span> <span class="mi">404</span><span class="p">,</span> <span class="mi">500</span><span class="p">,</span> <span class="mi">503</span><span class="p">]))</span>
</pre></div>
</li>
<li>
<p><strong>How to use pagination with search?</strong> here is the example from
<code>examples/paginated_search.py</code>: </p>
<div class="highlight"><pre><span></span><span class="n">auth</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">OAuthHandler</span><span class="p">(</span><span class="n">consumer_key</span><span class="p">,</span> <span class="n">consumer_secret</span><span class="p">)</span>
<span class="n">auth</span><span class="o">.</span><span class="n">set_access_token</span><span class="p">(</span><span class="n">access_token</span><span class="p">,</span> <span class="n">access_token_secret</span><span class="p">)</span>
<span class="n">api</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">API</span><span class="p">([</span><span class="n">auth</span><span class="p">],</span>
<span class="c1"># support for multiple authentication handlers </span>
<span class="c1"># retry 3 times with 5 seconds delay when getting these error codes</span>
<span class="c1"># For more details see </span>
<span class="c1"># https://dev.twitter.com/docs/error-codes-responses </span>
<span class="n">retry_count</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span><span class="n">retry_delay</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span><span class="n">retry_errors</span><span class="o">=</span><span class="nb">set</span><span class="p">([</span><span class="mi">401</span><span class="p">,</span> <span class="mi">404</span><span class="p">,</span> <span class="mi">500</span><span class="p">,</span> <span class="mi">503</span><span class="p">]),</span>
<span class="c1"># monitor remaining calls and block until replenished </span>
<span class="n">monitor_rate_limit</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">wait_on_rate_limit</span><span class="o">=</span><span class="bp">True</span>
<span class="p">)</span>
<span class="n">query</span> <span class="o">=</span> <span class="s1">'cupcake OR donut'</span>
<span class="n">page_count</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">for</span> <span class="n">tweets</span> <span class="ow">in</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">Cursor</span><span class="p">(</span><span class="n">api</span><span class="o">.</span><span class="n">search</span><span class="p">,</span> <span class="n">q</span><span class="o">=</span><span class="n">query</span><span class="p">,</span> <span class="n">count</span><span class="o">=</span><span class="mi">100</span><span class="p">,</span>
<span class="n">result_type</span><span class="o">=</span><span class="s2">"recent"</span><span class="p">,</span> <span class="n">include_entities</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span><span class="o">.</span><span class="n">pages</span><span class="p">():</span>
<span class="n">page_count</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="c1"># print just the first tweet out of every page of 100 tweets </span>
<span class="k">print</span> <span class="n">tweets</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">text</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="s1">'utf-8'</span><span class="p">)</span>
<span class="c1"># stop after retrieving 200 pages </span>
<span class="k">if</span> <span class="n">page_count</span> <span class="o">>=</span> <span class="mi">200</span><span class="p">:</span>
<span class="k">break</span>
</pre></div>
</li>
<li>
<p><strong>Response as python dictionary</strong>: sometimes you don't want tweepy
to spend time on parsing json into dedicated python object and have
the results simply as a dictionary. The example at
<code>examples/json_results.py</code> demonstrate querying a user timeline, tim
oreilly in this case, using a json payload type: </p>
<div class="highlight"><pre><span></span><span class="n">auth</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">OAuthHandler</span><span class="p">(</span><span class="n">consumer_key</span><span class="p">,</span> <span class="n">consumer_secret</span><span class="p">)</span>
<span class="n">auth</span><span class="o">.</span><span class="n">set_access_token</span><span class="p">(</span><span class="n">access_token</span><span class="p">,</span> <span class="n">access_token_secret</span><span class="p">)</span>
<span class="n">api</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">API</span><span class="p">([</span><span class="n">auth</span><span class="p">])</span>
<span class="n">json_user_timeline</span> <span class="o">=</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">binder</span><span class="o">.</span><span class="n">bind_api</span><span class="p">(</span>
<span class="n">path</span> <span class="o">=</span> <span class="s1">'/statuses/user_timeline.json'</span><span class="p">,</span>
<span class="n">payload_type</span> <span class="o">=</span> <span class="s1">'json'</span><span class="p">,</span> <span class="n">payload_list</span> <span class="o">=</span> <span class="bp">True</span><span class="p">,</span>
<span class="n">allowed_param</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'id'</span><span class="p">,</span> <span class="s1">'user_id'</span><span class="p">,</span> <span class="s1">'screen_name'</span><span class="p">,</span> <span class="s1">'since_id'</span><span class="p">,</span>
<span class="s1">'max_id'</span><span class="p">,</span> <span class="s1">'count'</span><span class="p">,</span> <span class="s1">'page'</span><span class="p">,</span> <span class="s1">'include_rts'</span><span class="p">])</span>
<span class="c1"># iterate over 50 of tim oreilly's tweets </span>
<span class="k">for</span> <span class="n">tweet</span> <span class="ow">in</span> <span class="n">tweepy</span><span class="o">.</span><span class="n">Cursor</span><span class="p">(</span><span class="n">json_user_timeline</span><span class="p">,</span> <span class="n">api</span><span class="p">,</span>
<span class="n">screen_name</span><span class="o">=</span><span class="s1">'timoreilly'</span><span class="p">,</span> <span class="n">count</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">include_rts</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span><span class="o">.</span><span class="n">items</span><span class="p">(</span><span class="mi">50</span><span class="p">):</span>
<span class="k">print</span> <span class="n">tweet</span><span class="p">[</span><span class="s1">'text'</span><span class="p">]</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="s1">'utf-8'</span><span class="p">)</span>
</pre></div>
</li>
</ul>
<p>Have a comment? an idea of how to do things more elegantly? leave a comment or contact me on twitter @grinbergnir.</p>
<style type="text/css"> #ssba {
padding: 4px;
border: 1px solid #e0dddd;
background-color: #42a7e2;
-moz-border-radius: 10px; -webkit-border-radius: 10px; -khtml-border-radius: 10px; border-radius: 10px; -o-border-radius: 10px;
}
#ssba img
{
width: 35px !important;
padding: 4px;
border: 0;
box-shadow: none !important;
display: inline;
vertical-align: middle;
}
#ssba, #ssba a
{
;
font-family: Indie Flower;
font-size: 20px;
color: #ffffff!important;
font-weight: bold;
}</style>
<div id="ssba"> Share!
<a id="ssba_twitter_share" href="http://twitter.com/intent/tweet?text=Love%20this%20post%20by%40grinbergnir%20./using-tweepy.html" target="_blank"><img title="Twitter" class="ssba" src="/theme/images/twitter.png" alt="Twitter"></a>
<a id="ssba_facebook_share" href="http://www.facebook.com/sharer.php?u=./using-tweepy.html" target="_blank"><img title="Facebook" class="ssba" src="/theme/images/facebook.png" alt="Facebook"></a>
<a id="ssba_linkedin_share" class="ssba_share_link" href="http://www.linkedin.com/shareArticle?mini=true&url=./using-tweepy.html" target="_blank"><img title="LinkedIn" class="ssba" src="/theme/images/linkedin.png" alt="LinkedIn"></a>
<a id="ssba_google_share" href="https://plus.google.com/share?url=./using-tweepy.html" target="_blank"><img title="Google+" class="ssba" src="/theme/images/google.png" alt="Google+"></a>
<a id="ssba_pinterest_share" href="javascript:void((function()%7Bvar%20e=document.createElement('script');e.setAttribute('type','text/javascript');e.setAttribute('charset','UTF-8');e.setAttribute('src','http://assets.pinterest.com/js/pinmarklet.js?r='+Math.random()*99999999);document.body.appendChild(e)%7D)());"><img title="Pinterest" class="ssba" src="/theme/images/pinterest.png" alt="Pinterest"></a>
<a id="ssba_reddit_share" href="http://reddit.com/submit?url=./using-tweepy.html&title=Love%20this%20post%20by%20%40grinbergnir" target="_blank"><img title="Reddit" class="ssba" src="/theme/images/reddit.png" alt="Reddit"></a><a id="ssba_diggit_share" class="ssba_share_link" href="http://www.digg.com/submit?url=./using-tweepy.html" target="_blank"><img title="Digg" class="ssba" src="/theme/images/diggit.png" alt="Digg"></a>
<a id="ssba_stumbleupon_share" class="ssba_share_link" href="http://www.stumbleupon.com/submit?url=./using-tweepy.html&title=Love%20this%20post%20by%20%40grinbergnir" target="_blank"><img title="StumbleUpon" class="ssba" src="/theme/images/stumbleupon.png" alt="StumbleUpon"></a>
<a id="ssba_tumblr_share" href="http://www.tumblr.com/share/link?url=./using-tweepy.html&name=Best%20Practices%20for%20Querying%20Twitter%20API%20using%20%28forked%29%20tweepy" target="_blank"><img title="tumblr" class="ssba" src="/theme/images/tumblr1.png" alt="tumblr"></a>
<a id="ssba_email_share" href="mailto:?subject=Love%20this%20post%20by%40grinbergnir&body=./using-tweepy.html" target="_blank"><img title="Email" class="ssba" src="/theme/images/email.png" alt="Email"></a>
</div>
<!--<p id="post-share-links">
Share on:
<a href="http://twitter.com/intent/tweet?text=Love%20this%20post%20by%40grinbergnir%20./using-tweepy.html" target="_blank" title="Share on Twitter">Twitter</a>
❄
<a href="http://www.facebook.com/sharer/sharer.php?u=./using-tweepy.html" target="_blank" title="Share via Email">Facebook</a>
❄
<a href="https://plus.google.com/share?url=./using-tweepy.html" target="_blank" title="Share via Email">Google+</a>
❄
<a href="http://www.linkedin.com/shareArticle?mini=true&url=./using-tweepy.html" target="_blank" title="Share via Email">Linkedin</a>
❄
<a href="mailto:?subject=Love%20this%20post%20by%40grinbergnir&body=./using-tweepy.html" target="_blank" title="Share via Email">Email</a>
</p>-->
<h2>Comments</h2>
<div id="disqus_thread"></div>
<script type="text/javascript">
/* * * CONFIGURATION VARIABLES: EDIT BEFORE PASTING INTO YOUR WEBPAGE * * */
var disqus_shortname = 'threefourteen'; // required: replace example with your forum shortname
/* * * DON'T EDIT BELOW THIS LINE * * */
(function() {
var dsq = document.createElement('script'); dsq.type = 'text/javascript'; dsq.async = true;
dsq.src = '//' + disqus_shortname + '.disqus.com/embed.js';
(document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
</script>
<noscript>Please enable JavaScript to view the <a href="http://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
<a href="http://disqus.com" class="dsq-brlink">comments powered by <span class="logo-disqus">Disqus</span></a>
</article>
</div>
</div>
</div>
</section>
</div>
</div>
</div>
</div>
<!-- Footer Wrapper -->
<div id="footer-wrapper">
<!-- Footer -->
<section id="footer" class="container">
<div class="row">
<div class="8u">
<section>
<header>
<h2>Latest posts</h2>
</header>
<ul class="dates">
<li>
<span class="date">Apr <strong>18</strong></span>
<h3><a href="eng-modes-news.html">Modes of Enagement with News and their Relationship to Information Gain in Text</a></h3>
<p><p>How do people engage with news articles post-click? How does the development of ideas within the text of articles affect post-click engagement? To address these questions I analyzed a large, client-side log data of over 7.7 million page views of 66,821 news articles from seven popular news publishers …</p></p>
</li>
<li>
<span class="date">Feb <strong>24</strong></span>
<h3><a href="feedback-exp.html">When Do People Expect More Likes and Comments on Facebook and From Whom?</a></h3>
<p><p>Despite the fact that social media has been around for over a decade now, we still know reletively little about people's expectations for getting attention and responses from their online friends. In <a href="/papers/feedback_exp.pdf">this work</a> we examine the often overlooked end of the attention ``transaction'': the content producer's expectation to be …</p></p>
</li>
<li>
<span class="date">Jan <strong>10</strong></span>
<h3><a href="contribution-effect.html">Changes in Engagement Before and After Posting to Facebook</a></h3>
<p><p>Our recent work got accepted to CHI 2016! See the
<a href="/pages/publications.html">publications</a> page for the full paper. </p>
<p>I think this work is interesting for three main reasons: the questions we addressed, the methodology for answering them, and the answers we arrived at. </p>
<p>The fundamental question we addressed in this work is …</p></p>
</li>
<li>
<span class="date">Feb <strong>28</strong></span>
<h3><a href="wsdm-2014-summary-opinions-and-other-thoughts.html">WSDM 2014 Summary, Opinions and Other Thoughts</a></h3>
<p><p>It was a great attending WSDM this year, right at the heart of NYC! I could not possibly cover all that went down there in the three days of the conference, but would use this post to highlight sessions and talks that I attended and found particularly interesting.</p>
<p>The conference …</p></p>
</li>
<li>
<span class="date">May <strong>19</strong></span>
<h3><a href="sem-exp-api.html">Our Semantic Keyword Expansion API is ON!</a></h3>
<p><p>If you ever wanted to track a certain topic on Twitter, you probably wondered what keywords people are using to refer to it. Our new Semantic Expansion API lets you do exactly that - given a short list of keywords, it finds keywords that appear in a similar context or directly …</p></p>
</li>
<li>
<span class="date">Apr <strong>28</strong></span>
<h3><a href="using-tweepy.html">Best Practices for Querying Twitter API using (forked) tweepy</a></h3>
<p><p>Over the past few weeks I have been accessing Twitter API from python
using tweepy. In the course of doing so, I added support in tweepy for
using multiple authentication handlers, monitoring rate limits, waiting
when running out of calls, support for search using Twitter API v1.1,
proper pagination …</p></p>
</li>
<li>
<span class="date">Mar <strong>30</strong></span>
<h3><a href="extracting-diurnal-patterns.html">Extracting Diurnal Patterns of Real World Activity from Social Media</a></h3>
<p><p>Our most recent work got accepted to ICWSM 2013! See the
<a href="/pages/publications.html">publications</a> page for the full paper.</p>
<p>Here is the abstract:<br>
In this study, we develop methods to identify verbal expressions in
social media streams that refer to real-world activities. Using
aggregate daily patterns of Foursquare checkins, our methods extract …</p></p>
</li>
</ul>
</section>
</div>
<div class="4u">
<section style="border-top: solid 1px #ddd; border-top-color: rgba(255,255,255,0.05); padding: 1.3em 0 0 0;">
<header>
<h2>Contact</h2>
</header>
<ul class="social">
<li class="facebook"><a href="https://www.facebook.com/nirgr" class="icon48 icon48-1">Facebook</a></li>
<li class="twitter"><a href="http://twitter.com/grinbergnir" class="icon48 icon48-2">Twitter</a></li>
<li class="linkedin"><a href="http://www.linkedin.com/pub/nir-grinberg/44/23b/15" class="icon48 icon48-4">LinkedIn</a></li>
<li class="googleplus"><a href="https://plus.google.com/110010581376861389601" class="icon48 icon48-6">Google+</a></li>
</ul>
<h3>Email</h3>
<a href="mailto:last.first name -AT- gmail.com">last.first name -AT- gmail.com</a>
<h3>Address</h3><p>177 Auntington Ave., 10th floor, Boston, MA 02115</p>
</section>
<section style="border-top: solid 1px #ddd; border-top-color: rgba(255,255,255,0.05); padding: 1.3em 0 1.3em 0;">
<header>
<h2>Blogroll</h2>
</header>
<ul>
<li><a href="http://lazerlab.net/">Lazer Lab</a></li>
<li><a href="http://www.cs.cornell.edu/">CS @ Cornell</a></li>
<li><a href="https://s.tech.cornell.edu/">Social Technologies Lab @ Cornell Tech</a></li>
<li><a href="https://research.facebook.com/datascience">Facebook Core Data Science</a></li>
<li><a href="http://www.iq.harvard.edu/">Harvard IQSS</a></li>
<li><a href="http://www.networkscienceinstitute.org/">Network Science Institute</a></li>
</ul>
</section>
</div>
</div>
<!--<div class="row">
<div class="4u">
</div>
<div class="4u">
<section>
<header>
<h2>Categories</h2>
</header>
<ul class="divided">
<li><a href="./category/uncategorized.html">Uncategorized</a></li>
</ul>
</section>
</div>
<div class="4u">
</div>
</div>-->
<div class="row">
<div class="12u">
<!-- Copyright -->
<div id="copyright">
<ul class="links">
<li>© Nir Grinberg 2018 </li>
<!--<li>Images: <a href="http://facebook.com/DreametryDoodle">Dreametry Doodle</a> + <a href="http://iconify.it">Iconify.it</a></li>-->
<li>Theme adapted from: <a href="http://html5up.net">HTML5 UP</a></li>
</ul>
</div>
</div>
</div>
</section>
</div>
<script src="/theme/js/jquery.min.js"></script>
<script src="/theme/js/jquery.dropotron.js"></script>
<script src="/theme/js/config.js"></script>
<script src="/theme/js/skel.min.js"></script>
<script src="/theme/js/skel-panels.min.js"></script>
<!--[if lte IE 8]><script src="js/html5shiv.js"></script><link rel="stylesheet" href="/theme/css/ie8.css" /><![endif]-->
</body>
</html>