Y Combinator Application
Company url, if any:
If you have a demo, what's the url? For non-software, demo can be a video.
Describe your company in 50 characters or less.
AI-based fact checking through social media data
What is your company going to make?
An analytics platform, backed by data science and machine learning, used to infer the credibility of events in real-time.
Which category best applies to your company?
Is this application in response to a YC RFS?
If yes, which one?
Where do you live now, and where would the company be based after YC?
Atlanta, GA, United States / New York, New York, United States
Email address of the founder who is filling out this application:
Please enter the url of a 1 minute unlisted (not private) YouTube video introducing the founders. (Follow the Video Guidelines.)
Please tell us about an interesting project, preferably outside of class or work, that two or more of you created together. Include urls if possible.
Tanu and Eric know each other through the PhD program at Georgia Tech. Eric advised Tanu throughout her PhD, and they worked on a number of projects together. A 2014 paper, “Phrases that Predict Success on Kickstarter” (http://comp.social.gatech.edu/papers/cscw14.crowdfunding.mitra.pdf), led to many conversations outside of academia with entrepreneurs who wanted it as a product. Christian and Eric worked closely together throughout the summer of 2016 on HackGT@UPC, one of the first international, collaborative hackathons. With Eric’s faculty oversight, Christian built this event over the course of a few months around the subject of “assistive technologies” (https://hack-gtupc.devpost.com/).
How long have the founders known one another and how did you meet? Have any of the founders not met in person?
Tanu and Christian met through Professor Eric Gilbert at Georgia Tech about 2 months ago. Eric is currently Tanu's PhD advisor, and Christian worked with Eric as his teaching assistant during the GT study abroad program in Barcelona, Spain last summer. When Tanu came to Eric about turning her PhD research project into a full-blown company, Eric immediately thought of Christian. Tanu and Christian met a couple of days later for coffee and discussion, then jumped into customer discovery and a bit of market research, which eventually led them to this application!
How far along are you?
Tanu has been working on this model, and the science and technologies surrounding it, since 2014, about 3 years now. Our first major paper (http://credcount.com/whitepaper) was reviewed and published in late 2014 at a top CS conference. Although the majority of the work, such as data collection, was finished in approximately 8 months, certain analyses have continued up to today. That said, we are in what we consider the prototyping stage: the service/technology in its current form is a proof of concept rather than something suited to any specific market. With a little modification to the machine learning models, we feel we can target new markets in as little as a couple of weeks. Thus, Christian has spent the last few months trying to identify where we fit and can have the biggest effect as a company/product. So far, we’ve considered news, politics, finance and the public sector. With contacts such as news anchor George Howell from CNN, political advisor and campaign manager Jason Boles, and Jeff Wolk, Senior Vice President of Investments at Raymond James, we feel our best fit at this moment is to ride the “fake news” wave. Currently there is no window dressing on our model. In its current form, it is command-line driven (literally “python model.py [input]”), but we feel we should identify our potential customers before we begin work on any sort of graphical interface and/or API. That’s why we’ve come to YC. Our current idea is a product/application built around our technology: an easy-to-use display of current news items and their credibility rankings, so that someone can read through an article with its presumed level of bias or accuracy shown up front. It ties into the narrative of “news literacy” and an attempt to show all sides of every story.
Imagine if every fact or bullet point in a news article had a credibility score, so that the items with the least standing fall by the wayside and only the true, solid facts of the situation are left.
How long have each of you been working on this? Have you been part-time or full-time? Please explain.
Tanu has been working on CredCount since early 2014 as part of her PhD in social computing with her advisor Eric Gilbert at Georgia Tech. The majority of data collection and analysis ran into early 2015, and she has continued to circle back to it in a “part-time” role as the project with the most potential as a product/service outside of academia. Christian joined early this year, 2017, in a product manager capacity to build this paper/model into a business. He has been working part-time while taking classes since joining and has found himself devoting more time to this venture than to schoolwork. Tanu and Christian are both ready to commit to the project for the summer and however long after that.
Which of the following best describes your progress?
When will you have a prototype or beta?
How many active users or customers do you have? If you have some particularly valuable customers, who are they?
Do you have revenue?
What was your revenue in the last full calendar month?
Anything else you would like us to know regarding your revenue or growth rate?
How much money do you spend per month?
How much money does your company have in the bank now?
How long is your runway?
Infinite (no money in or out yet), but we think we can get to market in around 3 months.
If you've applied previously with the same idea, how much progress have you made since the last time you applied? Anything change?
If you have already participated or committed to participate in an incubator, "accelerator" or "pre-accelerator" program, please tell us about it.
Why did you pick this idea to work on? Do you have domain expertise in this area? How do you know people need what you're making?
Tanu was drawn to the project by the prevalence of fake news spreading online following natural disasters such as Superstorm Sandy in 2012. As she saw people sharing a lot of incorrect or misleading information about the events, Tanu decided to track both big stories and smaller rumors, with the goal of creating an app that could help ordinary people sort fact from fiction when making decisions that could be crucial. Tanu has an undergraduate degree in computer engineering and is currently finishing her PhD in computer science with a specialization in social computing. Christian is working on his undergraduate degree, with a focus on the domain of people and society. He’s currently finishing up a Georgia Tech sponsored paper on the website http://voat.co, based on qualitative research through interviews with users. Based on Christian’s market research, the financial, political and news markets seem to be the most motivated to move forward with our work. In Christian’s discussions with Jason Boles, Boles said he would like to use our model as a sort of real-time polling for statements made by congressional candidate Kurt Wilson. While discussing the project with Jeff Wolk from Raymond James, Wolk proposed using the service to gauge the credibility of events through social media for commodity trading decisions. We have also discussed the possibility of parsing analyst papers and summarizing the ideas and decisions they propose, with scores attached. Lastly, in talks with George Howell, we learned that CredCount would be a great addition to the new media tool set, used alongside pre-existing services like https://www.dataminer.co/. With this, instead of focusing on volume and noise alone, news outlets could intelligently use the world’s reactions on social media as a measure of the credibility of a story or event.
What's new about what you're making? What substitutes do people resort to because it doesn't exist yet (or they don't know about it)?
Basic event analysis, in its current form, is based on volume and noise alone. What we bring to the table is the ability to use collective reactions on social media to help determine the reliability of a specific event. Think of social media posts and events as a haystack in which finding a needle is close to impossible. We help shrink this haystack to make fact checking far easier.
Who are your competitors, and who might become competitors? Who do you fear most?
Among companies grounded in data science and specific to the social computing field, we fear FactMata (https://medium.com/factmata/introducing-factmata-artificial-intelligence-for-political-fact-checking-db8acdbf4cf1) the most. As far as we know, to this day, they have no model or product and haven’t released anything close to the technology described in our whitepaper. That said, our main competitors will be other fact checking websites such as Snopes (http://www.snopes.com/) and PolitiFact. Beyond that, as Google and Facebook begin to roll out their own solutions, these will be internal solutions to problems faced by all media services on the internet. We do not fear these big players as much because we stand to help everyone else with a paid plug-in service, if we decide to go that route.
What do you understand about your business that other companies in it just don't get?
We understand the data science and linguistics side of these potential markets. What sort of language implies the most credible financial analysis of one specific commodity? How well does a certain political statement go over with a constituency? What sort of language should you look out for when flagging something as potentially fake? Today these questions can be answered with some human effort, but doing so takes time. We’re attempting to help narrow in on the credible pieces.
How do or will you make money? How much could you make?
A specific case we’ve looked into is a contract or subscription service with big media companies. DataMiner, for comparison, has made millions helping news organizations observe events as they happen, going from labels of “this might be newsworthy” to “lots of people are talking about this; you should check this out”. We hope that our service will be an even bigger asset to news media companies by bringing analytics that go beyond volume statistics. Our early research suggests these contracts have figures in the millions.
How will you get users? If your idea is the type that faces a chicken-and-egg problem in the sense that it won't be attractive to users till it has a lot of users (e.g. a marketplace, a dating site, an ad network), how will you overcome that?
For a product of our own, we’d need to attract potential users with a narrative around the social benefit of “news literacy”. Minds are changed through emotion rather than concrete factual evidence, so this application would be more of a training tool for seeing both sides than a way of showing someone raw evidence against their position. In this form we’d be very chicken-and-egg. As a service, we’ll get users through sales and marketing that show how much value we could add to markets like high frequency trading or politics. We’d have to line up demos and grow through grassroots efforts from the beginning.
Have you incorporated, or formed any legal entity (like an LLC) yet?
What kind of entity and in what state or country was the entity formed?
Please describe the breakdown of the equity ownership in percentages among the founders, employees and any other stockholders.
List any investments your company has received. Include the name of the investor, the amount invested, the premoney valuation / valuation cap, and the type of security sold (convertible notes, safes or stock).
If you have not formed the company yet, describe the planned equity ownership breakdown among the founders, employees and any other proposed stockholders.
Eric Gilbert (Georgia Tech Professor, Chief Scientist) - 8%, Tanushree Mitra - 46%, Christian Battaglia - 46%
Please provide any other relevant information about the structure or formation of the company.
Christian Battaglia will be in charge of most business-oriented tasks and product management of the service/platform. Tanushree will focus on data science, machine learning and the specifics of the model, tailored to whatever market we decide on. Eric will be in a scientific advising role, based on his many years of expertise in social computing.
Are any of the founders covered by noncompetes or intellectual property agreements that overlap with your project? If so, please explain.
Who writes code, or does other technical work on your product? Was any of it done by a non-founder? Please explain.
Tanu wrote the code, set up the Turkers and has done all technical work on the project so far. Christian is finishing his undergrad at Georgia Tech and has about 3 total years of work experience in internship, co-operative and other part-time capacities. He’s comfortable reviewing all code for the project in its current form as it relates to AI and machine learning. Christian was also previously head of web development at the European startup Glovo (http://glovoapp.com). That said, all technology was developed by founding members.
Is there anything else we should know about your company?
Because the work was originally funded by a DARPA grant under the umbrella of Georgia Tech, you might think that we would need to license this idea or give some credit to the university, but that is not the case. The primary paper has been out for over a year, meaning it cannot be patented. The nature of the technology, which is highly specific to social computing, machine learning and linguistics, forms a steep barrier.
If you had any other ideas you considered applying with, please list them. One may be something we've been waiting for. Often when we fund people it's to do something they list here and not in the main application.
Christian’s ideas include:
- A social experiment much like Facebook where every login or session has you assume the role and identity of a fictitious person. All data you create while in the role of this person persists after session expiration or sign out. Each time you log back in, you must put yourself in the shoes of the previously created identity and post as such, breaking the echo chamber that is social media today while also having fun.
- Event-going today has turned into a very spontaneous, unscheduled sort of thing. Oftentimes I find myself wishing I could get in touch with the people I met during some random evening for pictures, videos and just networking in general. Think of this as a digital collection of your concert/sports tickets where each one expands into your experience before and after the specific event!
- Having spent 3 summers abroad in Barcelona working closely with nightlife promoters, I’ve found that one thing they always ask for to improve their day to day is an easier way to get in touch with their network, so they can predict guest list figures and spam less. Think of this as a connection one degree removed from the promoter: I hand over the list of those specific few in my network who are headed to Barcelona for vacation. Ideally, the promoter would have access to my network only while my friends are travelling, which cuts down on the spam.
Please tell us something surprising or amusing that one of you has discovered.
Christian: I overanalyze social settings a bit too much, and over the years I’ve noticed just how much weight we give first impressions… With that, I always try to give strangers and new contacts the time of day and the benefit of the doubt while attempting to be my best self in return.
What convinced you to apply to Y Combinator?
When the triple request for startups based around news, jobs and democracy dropped, Tanushree and Christian immediately decided that it was time to apply. After years of mulling over the idea that this model could potentially be a business, we decided to jump all in.
How did you hear about Y Combinator?
Online and through friends