But there’s a dark side to this process: The countless marketing companies and hackers who write web-crawling scripts to gather massive data sets that serve their own ends.
So we wondered: How could we take that same web-crawling process and subvert it? Could we scrape a decently massive dataset and produce something wonderful?
We hit upon a ripe target: Food Network, which has amassed one of the richest repositories of cookery available today. Its website racks up over 200 million pageviews a month. But go try and find the perfect Bolognese recipe in 10 minutes. You can’t. There’s simply too much information, and it’s virtually impossible to extract any trends or heuristics from the dumb progression of web pages. This is the state of the web in a nutshell.
Things quickly got complicated. You can’t simply go out and scrape a massive site like the Food Network’s without getting sued. Those voluminous terms-of-service agreements that you find at the bottom of most websites are designed to prevent anyone from taking data and republishing it. So we asked Food Network very, very nicely: Would you be willing to let us scrape your data, with the aim of creating as many infographics as we can dream up? Pretty please? Amazingly, Food Network agreed. (Thanks, Danielle!)