Blog

Macro-causality and social science

Consider a little science experiment we’ve all done: figuring out whether a switch controls a light. How many data points does it usually take to convince you? Not many! Even if you didn’t do a randomized trial yourself and only observed somebody else manipulating the switch, you’d figure it out pretty quickly. This type of science is easy!

One thing that makes this easy is that you already know the right level of abstraction for the problem: what a switch is, that it has two states, and that it often controls things like lights. What if the data you had was actually a million variables, representing the state of every atom in the switch, or in the room?

Even though, technically, this data includes everything about the state of the switch, it’s overkill and not directly useful. For it to be useful, you would want to boil it back down to a “macro” description that is just a switch with two states. Unfortunately, it’s not easy to go from the micro description to the macro one. One reason is the “curse of dimensionality”: a few samples of a million-dimensional space leave it severely under-sampled, and directly applying machine learning methods to such data often produces unreliable results.

As an example of another thing that could go wrong, imagine that we detect, with p < 0.000001, that atom 173 is a perfect predictor of the light being on or off. Headlines immediately proclaim the important role of atom 173 in the production of light. A complicated apparatus to manipulate atom 173 is devised, only to reveal… nothing. The role of this atom is meaningless in isolation from the rest of the switch. And this hints at the meaning of “macro-causality”: to identify (simple) causal effects, we first have to describe our system at the right level of abstraction. Then we can say that flipping the switch causes the light to go on. There is a causal story involving all the atoms in the switch, electrons, etc., but it is not very useful.
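This failure mode is easy to reproduce. The sketch below is a toy illustration (all numbers invented): it draws a handful of observations of a binary “light” variable along with a million random binary “atom” variables that have nothing to do with the light, and counts how many atoms match the light perfectly by chance alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# 8 observations of the light (on/off), and a million random binary
# "atom state" variables that are independent of the light.
n_samples, n_atoms = 8, 1_000_000
light = rng.integers(0, 2, size=n_samples)
atoms = rng.integers(0, 2, size=(n_samples, n_atoms))

# How many atoms "perfectly predict" the light purely by chance?
matches = np.all(atoms == light[:, None], axis=0)
anti_matches = np.all(atoms != light[:, None], axis=0)
print(matches.sum() + anti_matches.sum())  # typically thousands of spurious "perfect predictors"
```

With only 8 observations, any given random atom matches the light with probability 2⁻⁸, so out of a million atoms, thousands of “atom 173”s are expected by chance.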

Social science’s micro-macro problem

Social science has a similar micro-macro problem. If we get “micro” data about every decision an individual makes, is it possible to recover the macro-state of the individual? You could ask the same question one level up, where the micro-variables are individuals and you want to know the state of an organization like a company.

Currently, we use expert intuition to come up with macro-states. For individuals, this might be a theory of personality or mood and include states like extroversion, or a test of depression, etc. After dreaming up a good idea for a macro-state, the expert makes up some questions that they think reflect that factor. Finally, they ask an individual to answer these questions. There are many places where things can go wrong in this process. Do experts really know all the macro states for individuals, or organizations? Do the questions they come up with accurately gauge these states? Are the answers that individuals provide a reliable measure?

Most of social science is about answering the last two questions. We assume we know what the right macro-states are (mood, personality, etc.) and we just need better ways to measure them. What if we are wrong? There may be hidden states underlying human behavior that remain unknown. This brings us back to the light switch example. If we can identify the right description of our system (a switch with two states), experimenting with the effects of the switch is easy.

The mapping from micro to macro is sometimes called “coarse-graining” by physicists. Unfortunately, coarse-graining in physics usually relies on reasoning from the physical laws of the universe, allowing us, for instance, to go from describing a box of atoms with many degrees of freedom to a simple description involving just three macro-variables: volume, pressure, and temperature. Social systems offer no such laws to reason from.

Finding ways to automate coarse-graining for complex systems, as arise in social science, is one of the main goals of my research. One simple idea, the CorEx principle, has motivated a lot of this work. The principle says that “a good macro-variable description should explain most of the relationships among the micro-variables.” We have gotten some mileage from this idea, finding useful structure in gene expression data, social science data, and (in ongoing work) brain imaging, but I suspect it’s far from enough to completely solve this problem.  Coarse-graining that allows us to simplify the causal description of our system (as in the light switch example) seems like a fruitful angle to push this research further, and I hope to make or see more progress on this question in the future. (A few ideas about this that I’m aware of are here: 1 2 3 4 5. I would love to know of things I’ve missed! )
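To make the principle concrete, the quantity underlying CorEx is total correlation, TC(X) = Σᵢ H(Xᵢ) − H(X₁, …, Xₙ), and a good macro-variable Y is one for which the remaining TC(X | Y) is near zero. The sketch below estimates both from synthetic samples; it illustrates the quantity only, not the CorEx optimization itself.

```python
import numpy as np
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (bits) of a sequence of hashable states."""
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def total_correlation(X):
    """TC(X) = sum_i H(X_i) - H(X_1,...,X_n): total dependence among columns."""
    joint = entropy(map(tuple, X))
    return sum(entropy(X[:, i]) for i in range(X.shape[1])) - joint

rng = np.random.default_rng(1)
switch = rng.integers(0, 2, size=5000)               # hidden macro-variable
atoms = np.column_stack([switch ^ (rng.random(5000) < 0.05)
                         for _ in range(4)])         # noisy micro-copies of it

print(total_correlation(atoms))   # large: the micro-variables are strongly related
# Conditioning on the macro-variable explains almost all of that dependence:
tc_given = np.mean([total_correlation(atoms[switch == s]) for s in (0, 1)])
print(tc_given)                   # near zero
```

The macro-variable “explains the relationships among the micro-variables” exactly when conditioning on it drives the total correlation toward zero.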


Source: Apparent Horizons

The “Grue” problem (and deep learning)

The Grue language doesn’t have words for “blue” or “green”. Instead Grue speakers have the following concepts:

grue: green during the day and blue at night

bleen: blue during the day and green at night

(This example is adapted from the original grue thought experiment.) To us, these concepts seem needlessly complicated. However, to a Grue speaker, it is our language that is unnecessarily complicated. For him, green has the cumbersome definition of “grue during the day and bleen at night”.

How can we wipe the smug smile off this Grue speaker’s face, and convince him of the obvious superiority of our own concepts of blue and green? What we do is sneak into his house at night and blindfold and drug the Grue speaker. We take him to a cave deep underground and leave him there for a few days. When he wakes up, he has no idea whether it is day or night. We remove his blindfold and present him with a simple choice: press the grue button and we let him go, but press the bleen button… Now he’s forced to admit the shortcomings of “grue” as a concept. By withholding irrelevant extra information (the time of day), grue does not provide any information about visual appearance. Obviously, if we told him to press the green button, he’d be much better off.

We say that grue-ness and time of day exhibit “informational synergy” with respect to predicting the visual appearance of an object. Synergy means the “whole is more than the sum of the parts” and in this case, knowing either the time of day or the grue-ness of an object does not help you predict its appearance, but knowing both together gives you perfect information.
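This can be checked numerically. The sketch below (synthetic data, variable names invented) estimates mutual information and shows that grue-ness alone and time of day alone each carry roughly zero bits about the color, while the pair carries a full bit.

```python
import numpy as np

def H(z):
    """Empirical entropy (bits) of discrete samples (rows, if 2-D)."""
    _, counts = np.unique(z, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def mi(x, y):
    """Empirical mutual information I(x; y) in bits."""
    return H(x) + H(y) - H(np.column_stack([x, y]))

rng = np.random.default_rng(2)
n = 10_000
is_day = rng.integers(0, 2, size=n)
is_grue = rng.integers(0, 2, size=n)
# grue = green by day, blue by night: green iff grue-ness matches daytime
is_green = 1 - (is_grue ^ is_day)

print(mi(is_grue, is_green))  # ~0 bits: grue-ness alone predicts nothing
print(mi(is_day, is_green))   # ~0 bits: time of day alone predicts nothing
print(mi(np.column_stack([is_grue, is_day]), is_green))  # ~1 bit: perfect together
```

The relationship is an XOR, the textbook example of synergy: every pairwise mutual information vanishes while the joint one is maximal.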

Grues in deep learning

This whimsical story is a close analogy for what happens in the field of “representation learning”. Neural nets and the like learn representations of data consisting of “neurons” that we can think of as concepts or words in a language, like “grue”. There’s no reason for generic deep learners to prefer a representation involving grue/bleen to one with blue/green, because either will make equally good predictions. And so most learned representations are synergistic, and when we look at individual neurons in these representations, they have no apparent meaning.

The importance of interpretable models is becoming acutely apparent in biomedical fields, where black-box predictions can be actively dangerous. We would like to quantify and minimize synergies in representation learning to encourage more interpretable and robust representations. Early attempts to do this are described in this paper about synergy, and another paper demonstrates some benefits of a less synergistic factor model.

Revenge of the Grue

Now, after making this case, I want to expose our linguo-centrism and provide the Grue apologist’s argument, adapted from a conversation with Jimmy Foulds. It turns out the Grue speakers live on an island that has two species of jellyfish: a bleen-colored one that is deadly poisonous and a grue-colored one which is delicious. Since the Grue people encounter these jellyfish on a daily basis and their very lives are at stake, they find it very convenient to speak of “grue” jellyfish, since in the time it takes them to warn about a “blue during the day but green at night jellyfish”, someone could already be dead. This story doesn’t contradict the previous one but highlights an important point. Synergy only makes sense with respect to a certain set of predicted variables. If we minimize synergies in our mental model of the world, then our most common observations and tasks will determine what constitutes a parsimonious representation of our reality.

Acknowledgments

I want to thank some of the PhD students who have been integral to this work. Rob Brekelmans did many nice experiments for the synergy paper and provided code for the character disentangling benchmark task in the paper. Dave Kale suggested key aspects of this setup. Finally, Hrayr Harutyunyan has been doing some amazing work on understanding and improving different aspects of these models. The code for the disentangled linear factor models is here; I hope to do some in-depth posts about different aspects of that model (like blessings of dimensionality!).


Source: Apparent Horizons

Twitter bots for good, and information contagion!

Our latest work, titled “Evidence of complex contagion of information in social media: An experiment using Twitter bots”, was published in PLOS ONE on September 22, 2017!

In this study, in collaboration with Bjarke Mønsted, Piotr Sapieżyński, and Sune Lehmann from the Technical University of Denmark (DTU), we studied the effects of deploying positive interventions on Twitter using social bots.

The DTU team developed and deployed 39 Twitter bots, which connected with the community of Twitter users in San Francisco during the second half of 2014. Starting in early October 2014 and throughout the rest of the year, the bots, some of which accrued thousands of followers, began to introduce positive memes (listed in the table) to foster public health and fitness behaviors and to encourage doing social good.

By combining mathematical modelling with statistical techniques, we used the data we collected to study how information spreads on Twitter. In particular, we sought to understand whether information passes from person to person like an epidemic (simple contagion), where each exposure to a virus (or, likewise, a meme) yields an independent probability of contracting the disease, or whether being exposed to the meme multiple times from multiple sources greatly enhances the probability of the meme being adopted/retweeted by a user (complex contagion).
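The two hypotheses can be sketched as two adoption-response curves. The parameters below are invented for illustration, not the fitted values from the paper.

```python
def p_adopt_simple(k, p=0.05):
    """Simple contagion: each of k exposures is an independent chance to adopt."""
    return 1 - (1 - p) ** k

def p_adopt_complex(k, threshold=3, p_low=0.01, p_high=0.5):
    """Complex contagion: adoption becomes far more likely once exposures
    from distinct sources reach a threshold (hypothetical parameterization)."""
    return p_high if k >= threshold else p_low * k

for k in range(1, 6):
    print(k, round(p_adopt_simple(k), 3), p_adopt_complex(k))
```

Under simple contagion, each extra exposure adds a shrinking marginal probability; under complex contagion, the threshold exposure produces a disproportionate jump, which is the signature the experiment looked for.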

Our analysis shows that the complex contagion hypothesis best captures information diffusion dynamics on Twitter. In our experiment, Twitter users naturally partitioned themselves into groups following one bot, two bots, three bots, and so on. This allowed us to record the number and sources of meme exposures for each user in our pool, and therefore to estimate, for the first time in a setting resembling a semi-controlled experiment, which factors play a role in online information diffusion: for the type of positive memes we introduced, seeing a meme from multiple sources greatly increased the probability of retweeting it.

We hope to use what we learned from this study to improve our ability to deliver online interventions in the future!

You can read the rest of the study in PLOS ONE!

Cite as:

Mønsted B, Sapieżyński P, Ferrara E, Lehmann S (2017) Evidence of complex contagion of information in social media: An experiment using Twitter bots. PLOS ONE 12(9): e0184148. https://doi.org/10.1371/journal.pone.0184148

 Press coverage:

  1. Researchers find that Twitter bots can be used for good – Tech Crunch
  2. Twitter Bots Can Encourage Decent Conduct, Not Just Fake News – News18
  3. Twitter bots for good: USC ISI study reveals how information spreads on social media – EurekAlert!



Source: Emilio

Diffusion of ISIS propaganda on Twitter

My latest work, titled “Contagion dynamics of extremist propaganda in social networks”, has been published in Information Sciences. The study aims to model and understand the diffusion of extremist propaganda, in particular content in support of ISIS, on social media like Twitter.

Starting from a list of twenty-five thousand annotated accounts that have been associated with ISIS and suspended by Twitter, we obtained a large Twitter dataset of over one million posts these users generated. We studied network and temporal activity patterns, and investigated the dynamics of social influence within ISIS supporters. 

To quantify the effectiveness of ISIS propaganda and determine the adoption of extremist content in the general population, we drew a parallel between radical propaganda and epidemics spreading. We identified information broadcasters and influential ISIS supporters and showed that they generate highly-infectious cascades of information contagion.

To read further, please refer to the published journal version. The paper is also available on arXiv.

Cite as:

Emilio Ferrara. Contagion dynamics of extremist propaganda in social networks. Information Sciences (2017) doi:10.1016/j.ins.2017.07.030


Source: Emilio

#MacronLeaks, bots, and the 2017 French election

My latest work investigates the #MacronLeaks disinformation campaign that occurred in the run up to the 2017 French presidential election.

Using a large dataset containing nearly 17 million tweets posted by users between the end of April and May 7, 2017 (Election Day), I first isolated the campaign that was carried out to allegedly reveal fraud and other illicit activities related to the moderate candidate Emmanuel Macron, and to support the far-right candidate Marine Le Pen.

New yet simple machine learning techniques devised specifically to analyze the millions of users appearing in this dataset revealed a large social bot operation and pointed to nearly 18 thousand bots deployed to push #MacronLeaks and related topics. The campaign attracted significant attention on the eve of Election Day, engaging overall nearly 100 thousand users in the time span of a few days.

The analysis revealed important new insights about social bot operations and disinformation campaigns on online social media:

  1. Many bot accounts that supported alt-right narratives in the context of #MacronLeaks were originally created shortly before the 2016 U.S. presidential election and were used to support the same views in the context of American politics. The accounts went dark after November 8, 2016, only to re-emerge at the beginning of May 2017 to push #MacronLeaks, attack Macron, and support the far-right candidate Marine Le Pen. This corroborates a recent hypothesis about the existence of black markets for reusable political botnets.
  2. The audience engaged with #MacronLeaks was mainly an English-speaking, American user base rather than French users. Their prior interests prominently featured support for Trump and Republican views, as well as more extreme, alt-right narratives. This suggests a possible explanation for the limited success of the disinformation campaign: French users, those more likely to vote for Macron, were neither mobilized nor significantly engaged in discussing the leaked documents.

The paper, titled “Disinformation and Social Bot Operations in the Run Up to the 2017 French Presidential Election”, is set for publication on August 7, 2017 in the peer-reviewed journal First Monday. To learn more about this work, read the preprint paper available on SSRN.

Cite as:

Emilio Ferrara. Disinformation and Social Bot Operations in the Run Up to the 2017 French Presidential Election. First Monday, 22(8), 2017

Coverage

  1. Fake news bots are so economical, you can use them over and over – Harvard NiemanLab
  2. Pro-Trump Twitter bots were also used to target Macron, research shows – The Verge
  3. There’s a Bit of Overlap Between Bots Trying to Manipulate American and French Elections – New York Magazine
  4. Research links pro-Trump, anti-Macron Twitter bots – The Hill
  5. The Same Twitter Bots That Helped Trump Tried to Sink Macron, Researcher Says – VICE

Press in non-English media

  1. Macron Leaks : Les bots pro-Trump utilisés dans la campagne de désinformation – Le Monde (in French)


Source: Emilio

Gene expression updates

The work with Shirley Pepke on using CorEx to find patterns in gene expression data is finally published in BMC Medical Genomics.

Shirley wrote a blog post about it as well. She will present this work at the Harvard Precision Medicine conference and we’ll both present at Berkeley’s Data Edge conference.

The code we used for the paper is online. I’m excited to see what people discover with these techniques, but I also can see we have more to do. If speed is an issue (it took us two days to run on a dataset with 6000 genes… many datasets can have an order of magnitude more genes), please get in touch as we have some experimental versions that are faster. We are also working on making the entire analysis pipeline more automated (i.e. connecting discovered factors with known biology and visualizing predictive factors.) To that end, I want to thank the Kestons for supporting future developments under the Michael and Linda Keston Executive Directorship Endowment.



Source: Apparent Horizons

Millions of social bots invaded Twitter!

Our work titled Online Human-Bot Interactions: Detection, Estimation, and Characterization has been accepted for publication at the prestigious International AAAI Conference on Web and Social Media (ICWSM 2017) to be held in Montreal, Canada in May 2017!

The goal of this study was twofold: first, we aimed to understand how difficult it is to detect social bots on Twitter, both for machine learning models and for humans. Second, we wanted to perform a census of the Twitter population to estimate how many accounts are controlled not by humans but by computer software (bots).

To address the first question, we developed a family of machine learning models that leverage over one thousand features characterising the online behaviour of Twitter accounts. We then trained these models on manually-annotated collections of human- and bot-controlled accounts across the spectrum of complexity, ranging from simple bots to very sophisticated ones fueled by advanced AI. We discovered that, while human accounts and simple bots are very easy to identify, both by other humans and by our models, there exists a family of sophisticated social AIs that systematically escapes identification by our models and by human snap judgment.
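As a cartoon of the classification setup (not the paper’s actual thousand-feature models or data), even a crude classifier over two invented behavioral features separates simple, crude bots from humans almost perfectly.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

# Hypothetical behavioral features per account: [tweets per day, retweet fraction]
humans = np.column_stack([rng.normal(5, 2, n), rng.normal(0.3, 0.1, n)])
bots = np.column_stack([rng.normal(40, 10, n), rng.normal(0.8, 0.1, n)])
X = np.vstack([humans, bots])
y = np.repeat([0, 1], n)  # 0 = human, 1 = bot

# Hold out ~20% of accounts and classify by nearest class centroid
train = rng.random(2 * n) < 0.8
centroids = np.array([X[train & (y == c)].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(X[~train][:, None, :] - centroids[None, :, :], axis=2)
accuracy = (dists.argmin(axis=1) == y[~train]).mean()
print(accuracy)  # near 1.0: crude bots are trivially separable
```

Sophisticated bots that mimic human behavioral statistics collapse these two clusters together, which is exactly the regime in which both models and human judges fail.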

Our second finding reveals that a significant fraction of Twitter accounts, between 9% and 15%, are likely social bots. This translates into nearly 50 million accounts, according to recent estimates that put the Twitter user base at above 320 million. Although not all bots are dangerous, many are used for malicious purposes: in the past, for example, Twitter bots have been used to manipulate public opinion during election times, to manipulate the stock market, and by extremist groups for radical propaganda.

To learn more, read our paper: Online Human-Bot Interactions: Detection, Estimation, and Characterization.

Cite as:

Onur Varol, Emilio Ferrara, Clayton Davis, Filippo Menczer, Alessandro Flammini. Online Human-Bot Interactions: Detection, Estimation, and Characterization. ICWSM 2017


Press Coverage

  1. CMO Today: Marketers and Political Wonks Gather for SXSW – The Wall Street Journal
  2. Huge number of Twitter accounts are not operated by humans – ABC News
  3. Up to 48 million Twitter accounts are bots, study says – CNET
  4. R u bot or not? – VICE
  5. New Machine Learning Framework Uncovers Twitter’s Vast Bot Population – VICE/Motherboard
  6. A Whopping 48 Million Twitter Accounts Are Actually Just Bots, Study Says – Tech Times
  7. Study reveals whopping 48M Twitter accounts are actually bots – CBS News
  8. Twitter is home to nearly 48 million bots, according to report – The Daily Dot
  9. As many as 48 million Twitter accounts aren’t people, says study – CNBC
  10. New Study Says 48 Million Accounts On Twitter Are Bots – We are social media
  11. Almost 48 million Twitter accounts are bots – Axios
  12. Twitter user accounts: around 15% or 48 million are bots [study] – The Vanguard
  13. Rise of the TWITTERBOTS – Daily Mail
  14. 15 per cent of Twitter is bots, but not the Kardashian kind – The Inquirer
  15. 48 mn Twitter accounts are bots, says study – The Economic Times
  16. 9-15 per cent of Twitter accounts are bots, reveals study – Financial Express
  17. Nearly 48 million Twitter accounts are bots: study – Deccan herald
  18. Study: Nearly 48 Million Twitter Accounts Are Fake; Many Push Political Agendas – The Libertarian Republic
  19. As many as 48 million accounts on Twitter are actually bots, study finds – Sacramento Bee
  20. Study Reveals Roughly 48M Twitter Accounts Are Actually Bots – CBS DFW
  21. Up to 48 million Twitter accounts may be Bots – Financial Buzz
  22. Up to 15% of Twitter accounts are not real people – Blasting News
  23. Tech Bytes: Twitter is Being Invaded by Bots – WDIO Eyewitness News
  24. About 9-15% of Twitter accounts are bots: Study – The Indian Express
  25. Twitter Has Nearly 48 Million Bot Accounts, So Don’t Get Hurt By All Those Online Trolls – India Times
  26. Twitter May Have 45 Million Bots on Its Hands – Investopedia
  27. Bots run amok on Twitter – My Broadband
  28. 9-15% of Twitter accounts are bots: Study – MENA FN
  29. Up To 15 Percent Of Twitter Users Are Bots, Study Says – Vocativ
  30. 48 million active Twitter accounts could be bots – Gearbrain
  31. Study: 15% of Twitter accounts could be bots – Marketing Dive
  32. 15% of Twitter users are actually bots, study claims – MemeBurn
  33. Almost 48 million Twitter accounts are bots – Click Lancashire

Press in non-English media

  1. Bad Bot oder Mensch – das ist hier die Frage – Medien Milch (in German)
  2. Studie: Bis zu 48 Millionen Twitter-Nutzer sind in Wirklichkeit Bots – T3N (in German)
  3. Der Aufstieg der Twitter-Bots: 48 Millionen Nutzer sind nicht menschlich – Studie – Sputnik News (in German)
  4. Studie: Bis zu 48 Millionen Nutzer auf Twitter sind Bots – der Standard (in German)
  5. “Blade Runner”-Test für Twitter-Accounts: Bot oder Mensch? – der Standard (in German)
  6. Bot-Paradies Twitter – Sachsische Zeitung (in German)
  7. 15 Prozent Social Bots? – DLF24 (in German)
  8. TWITTER: IST JEDER SIEBTE USER EIN BOT? – UberGizmo (in German)
  9. Twitter: Bis zu 48 Millionen Bot-Profile – Heise (in German)
  10. Studie: Bis zu 15 Prozent aller aktiven, englischsprachigen Twitter-Konten sind Bots – Netzpolitik (in German)
  11. Automatische Erregung – Wiener Zeitung (in German)
  12. 15 por ciento de las cuentas de Twitter son ‘bots’: estudio – CNET (in Spanish)
  13. 48 de los 319 millones de usuarios activos de Twitter son bots – TIC Beat (in Spanish)
  14. 15% de las cuentas de Twitter son ‘bots’ – Merca 2.0 (in Spanish)
  15. 48 de los 319 de usuarios activos en Twitter son bots – MDZ (in Spanish)
  16. Twitter, paradis des «bots»? – Slate (in French)
  17. Twitter compterait 48 millions de comptes gérés par des robots – MeltyStyle (in French)
  18. Twitter : 48 millions de comptes sont des bots – blog du moderateur (in French)
  19. ’30 tot 50 miljoen actieve Twitter-accounts zijn bots’ – NOS (in Dutch)
  20. 48 εκατομμύρια χρήστες στο Twitter δεν είναι άνθρωποι, σύμφωνα με έρευνα Πηγή – LiFo (in Greek)
  21. 48 triệu người dùng Twitter là bot và mối nguy hại – Khoa Hoc Phattrien (in Vietnamese)


Source: Emilio

Complex System Society 2016 Junior Scientific Award!

I was selected as recipient of the 2016 Junior Scientific Award by the Complex System Society!

The award reads: “Emilio Ferrara is one of the most active and successful young researchers in the field of computational social sciences. His works include the design and application of novel network-science models, algorithms, and tools to study phenomena occurring in large, dynamical techno-social systems. They improved our understanding of the structure of large online social networks and the dynamics of information diffusion. He has explored online social phenomena (protests, rumours, etc.), with applications to model and forecast individual behaviour, and characterise information diffusion and cyber-crime.”



Source: Emilio

Twitter, Social Bots, and the US Presidential Elections!

First Monday: Social bots distort the 2016 U.S. Presidential election online discussion

Our paper titled Social bots distort the 2016 U.S. Presidential election online discussion was published in the November 2016 issue of First Monday and selected as Editor’s featured article!

We investigated how social bots, automatic accounts that populate the Twitter-sphere, are distorting the online discussion about the 2016 U.S. Presidential elections. In a nutshell, we discovered that:

  • About one in five tweets about the elections was posted by a bot, totaling about 4 million tweets posted during the month prior to the elections by over 400,000 bots.
  • Regular (human) users often cannot determine whether the source of a piece of information is a legitimate user or a bot; as a result, bots are retweeted at the same rate as humans.
  • Bots are biased (by construction): Trump-supporting bots, for example, systematically produced only positive content in support of their candidate, altering public perception by giving the impression of grassroots, sustained support for that candidate.
  • It remains impossible, to date, to determine who is behind these bots (the master puppeteers): single individuals, third-party organizations, and even foreign governments may be orchestrating these operations.

To know more, read our paper: Social bots distort the 2016 U.S. Presidential election online discussion

Cite as:

Alessandro Bessi, Emilio Ferrara. Social bots distort the 2016 U.S. Presidential election online discussion. First Monday 21(11), 2016

Press Coverage

  1. How the Bot-y Politic Influenced This Election – MIT Technology Review
  2. Facebook, Twitter & Trump – The New York Review of Books
  3. How Twitter bots played a role in electing Donald Trump – WIRED
  4. How Twitter bots helped Donald Trump win the US presidential election – Arstechnica
  5. On Twitter, No One Knows You Are a Trump Bot – Fast Company
  6. Election 2016 Belongs to the Twitter Bots – VICE
  7. Almost a fifth of election chatter on Twitter comes from bots – Fusion
  8. Study reports that nearly 20% of election-related tweets were ‘algorithmically driven’ – Talking New Media
  9. How Twitter bots affected the US presidential campaign – The Conversation
  10. Advertising is driving social media-fuelled fake news and it is here to stay – The Conversation
  11. 20% of All Election Related Tweets Came From Non-Humans – Futurism
  12. Twitter Bots Dominate 2016 Presidential Election: New Study – Heavy
  13. Tracking The Election With Social Media In Real-Time: How Accurate Is It? – Heavy
  14. BOTS ‘SWAY’ ELECTION Fake tweets by social media robots could swing US Presidential election – The Sun (UK)
  15. A fifth of all US election tweets have come from bots – ABC News
  16. There are 400,000 Bots That Just Tweet Political Views All Day – Investopedia
  17. Real, or not? USC study finds many political tweets come from fake accounts – Science Blog
  18. Software bots distort Donald Trump support on Twitter: Study – ETCIO
  19. How hackers, social bots, data analysts shaped the U.S. election – The Nation
  20. That swarm of political tweets in your feed? Many could be from bots – The Business Journals
  21. Software ‘bots’ distort Trump support on Twitter – New Vision
  22. Bots Invade Twitter, Spreads Misinformation On US Election – EconoTimes
  23. Software ‘bots’ seen skewing support for Trump on Twitter – The Japan Times
  24. US Presidential Elections 2016: Bot-generated fake tweets influencing US election outcome, says new study – Indian Express
  25. US elections 2016: Researchers show how Twitter bots are trying to influence the poll in favour of Trump – International Business Times
  26. Hillary vs Trump: Most of the election chatter online by Twitter bots, says study – Tech 2 First Post
  27. Twitter bots distort Trump support – iAfrica
  28. Social Media ‘Bots’ Working To Influence U.S. Election – CBS San Francisco
  29. Elezioni Usa: il 19% dei tweet elettorali è prodotto da software – Repubblica.it (in Italian)
  30. Almost a fifth of election chatter on Twitter comes from bots – Full Act
  31. Software ‘bots’ distort Trump support on Twitter: study – Yahoo! News
  32. Bots Will Break 2016 US Elections Results – iTechPost
  33. Scientist Worries Robot-Generated Tweets Could Compromise The Presidential Election – Newsroom America
  34. Software ‘bots’ distort Trump support on Twitter: study – Phys.org
  35. Spotlight: Fake tweets endanger integrity of U.S. presidential election – XinhuanNet
  36. New Study: Twitter Bots Amount for One-Fifth of US Election Conversation – Dispatch Weekly
  37. Are Robot generated Tweets compromising US Polls? – TechRadar India
  38. Fake tweets endanger integrity of US presidential election – Global Times
  39. Software ‘bots’ distort Trump support on Twitter: study – The Daily Star
  40. Software ‘bots’ distort Trump support on Twitter: study – News Dog
  41. Malicious Twitter bots could have profound consequences for the election – RawStory
  42. ‘Robot-generated fake tweets influencing US election outcome’ – DNA – Daily News & Analysis
  43. Sophisticated Bot-Generated Tweets Could Influence Outcome of US Presidential Election – Telegiz
  44. UIC Journal Shows ‘Bots’ Sway Political Discourse, Could Impact Election – NewsWise
  45. Bot-generated tweets could threaten integrity of 2016 US presidential election: Study – BGR.in
  46. Robots behind the millions of tweets: “The integrity at danger” – Svenska Dagbladet (in Swedish)
  47. Bot generated tweets influence US Presidential election polls – I4U News
  48. High percentage of robot-generated fake tweets likely to influence public opinion – NewsGram
  49. ‘Robot-generated fake tweets influencing US election outcome’ – Press Trust of India
  50. Robot-generated fake tweets influencing US election outcome: Study – IndianExpress
  51. Fake Tweets, real consequences for the election – Phys.org
  52. Real, or not? USC study finds many political tweets come from fake accounts – USC News
  53. We’re in a digital world filled with lots of social bots – USC News


Source: Emilio

Cancer in the time of algorithms

Edit: Also check out the story by the Washington Post and on cancer.gov.

Shirley is a collaborator of mine who uses gene expression data to better understand ovarian cancer. She has a remarkable personal story that is featured in a podcast about our work together. I laughed, I cried, I can’t recommend it enough. It can be found on iTunes and on SoundCloud (link below).

As a physicist, I’m drawn towards simple principles that can explain phenomena that look complex. In biology, on the other hand, explanations tend to be messy and complicated. My recent work has really revolved around trying to use information theory to cut through messy data to discover the strongest signals. My work with Shirley applies this idea to gene expression data for patients with ovarian cancer. Thanks to Shirley’s amazing work, we were able to find a ton of interesting biological signals that could potentially have a real impact on treating this deadly disease. You can see a preprint of our work here.

I want to share one quick result. People often judge clusters discovered in gene expression data based on how well they recover known biological signals. The plot below shows how well our method (CorEx) does compared to a standard method (k-means) and a very popular method in the literature (hierarchical clustering). We are doing a much better job of finding biologically meaningful clusters (at least according to gene ontology databases), and this is very useful for connecting our discovery of hidden factors that affect long-term survival to new drugs that might be useful for treating ovarian cancer.
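To make the evaluation idea concrete, here is a minimal sketch of scoring clusters against annotated gene sets. The paper uses gene ontology enrichment statistics; this toy version uses plain Jaccard overlap, and the gene IDs and set names are invented.

```python
def best_overlap(cluster, gene_sets):
    """Best Jaccard overlap between one cluster and any annotated gene set."""
    c = set(cluster)
    return max(len(c & s) / len(c | s) for s in gene_sets.values())

# Toy annotation database and two discovered clusters (hypothetical gene IDs)
gene_sets = {
    "immune_response": {"g1", "g2", "g3", "g4"},
    "cell_cycle": {"g5", "g6", "g7"},
}
clusters = [["g1", "g2", "g3"], ["g5", "g6", "g9"]]
scores = [best_overlap(c, gene_sets) for c in clusters]
print(scores)  # [0.75, 0.5]
```

A clustering method that scores higher on this kind of measure is recovering groups of genes that already have known biological meaning, which is what the comparison in the plot captures.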

TCGA clusters



Source: Apparent Horizons