January 24, 2011

Is Snooth scraping data from CellarTracker?

I didn’t intend to write a follow-up to yesterday’s post, The TRUTH About Snooth Data, but a response from a CellarTracker user convinced me that there was more to tell. Many of you may know about CellarTracker, but for those who don’t, CellarTracker has been an institution of wine/tech. It was the first major cellar management tool, and its founder, Eric LeVine, has worked tirelessly for almost a decade improving his application and serving high-end wine consumers. He is also one of the most dedicated and altruistic people in the business (he doesn’t even charge for his service; he just asks that people “donate” if they like his software). In his own words:

“Since its launch in 2004, registration has grown to 126,000 members. On a typical day, you are tracking (adding or removing) close to 20,000 bottles for a total of 21.6 million bottles. The database includes 980,000 wines from 70,000 producers, one of the largest in the world. The CellarTracker community has also emerged as an abundant source of wine reviews with more than 1,600 wines reviewed in a typical day for a total of 1,663,000 wine reviews, all written by you, real wine enthusiasts. Amateur reviews are not a replacement for those written by professional critics, but you are generating as many reviews in six days as Robert Parker publishes in an entire year. You have created the largest collection of wine reviews in the world, all freely available for the wine-loving community to enjoy. The site is one of the most heavily visited wine websites in the world with 25 million page views per month from several hundred thousand unique visitors.”

Without a doubt, this huge collection of data is a goldmine, and any wine tech company would love to have all those wines, wineries, and especially reviews in its database. There were rumors at one time of data being challenged between CellarTracker and Snooth, but the details were never revealed. However, as a result of yesterday’s article, an avid CellarTracker user sent me evidence of more deception on the part of Snooth. On analysis, it really appears that Snooth IS scraping data, reorganizing it as their own, and using it to grow their business.

Here are some of the examples, provided by the CellarTracker user, from a review by our friend Robert Dwyer (notice the keywords I have highlighted in black in the tasting note below):

http://www.cellartracker.com/new/wine.asp?iWine=948573

Now notice the tags for the same wine on Snooth (I have highlighted the “tags” in red that match the text from Robert’s CellarTracker review):

http://www.snooth.com/wine/bolen-family-estates-merlot-2007/

Every (and I mean EVERY) single user tag on Snooth appears to have been scraped from the CellarTracker review (even the very strange non-wine tag of “sample”). I wondered if this was an isolated incident but it appears there are many more examples throughout the Snooth site. User tags for many wines appear to all be from content taken from CellarTracker.

http://www.cellartracker.com/wine.asp?iWine=1072748

And the same wine on Snooth:

http://www.snooth.com/wine/sybille-kuntz-riesling-scharz-schiefergestein-2009/

All of the keywords match (except that “vegetal” became “vegetables” and “acidity” became “acidic”, an easy adjustment for a computer algorithm to make). The keyword “try” strikes me as most revealing, coupled with the fact that there are NO reviews on Snooth for this wine. If there are no user reviews for the wine on Snooth, how does it have “User Tags”?
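For readers who want to repeat this kind of comparison themselves, here is a rough sketch in Python. It is not how Snooth or CellarTracker process anything; the function name `tag_overlap`, the five-letter prefix rule, and the sample note and tags are all made up for illustration. It simply measures what share of a tag list can be traced back to words in a tasting note, treating near-variants such as “acidity” and “acidic” as a match.

```python
import re


def tag_overlap(review_text, tags, prefix_len=5):
    """Return the fraction of tags whose leading letters match a word in the review.

    Comparing only the first `prefix_len` letters is a crude stand-in for
    stemming (an assumption of this sketch), so "acidic" matches "acidity".
    """
    words = re.findall(r"[a-z]+", review_text.lower())
    stems = {word[:prefix_len] for word in words}
    if not tags:
        return 0.0
    matched = [tag for tag in tags if tag.lower()[:prefix_len] in stems]
    return len(matched) / len(tags)


# Hypothetical note and tags, invented for this example only.
sample_review = "Bright acidity, slightly vegetal, worth a try."
sample_tags = ["acidic", "vegetables", "try", "oak"]

print(tag_overlap(sample_review, sample_tags))  # 3 of the 4 made-up tags match
```

A high overlap on a single wine proves little on its own, since common tasting words recur everywhere. It becomes more telling when unusual words match too, or when the tags exist on a page that has no reviews at all.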

And I found more and more as I kept looking deeper:

http://www.cellartracker.com/wine.asp?iWine=834902

http://www.snooth.com/wine/joh-jos-prum-wehlener-sonnenuhr-riesling-spatlese-1986/

and

http://www.cellartracker.com/wine.asp?iWine=45308

http://www.snooth.com/wine/joh-jos-prum-wehlener-sonnenuhr-riesling-auslese-goldkapsel-1985/

I was actually disappointed to find this type of behavior from Snooth. A campaign of hyperbole and misrepresentation is one thing, but this is a new low.

It also raises a lot of questions in my mind. How much damage has this caused CellarTracker? How much has CellarTracker’s content helped grow Snooth’s database (not only of tags but adding wines to the database)? How much has that content helped attract users? And partners? What is the exposure for the content partners (Epicurious, Wine and Spirits, etc) that are leveraging this data? Will Eric LeVine sue?

Is Snooth scraping CellarTracker data? You be the judge.

  • This is horribly disappointing and unethical behavior. These notes are all just a few weeks old. I would love to see a valid explanation.

    -Eric LeVine
    CellarTracker.com

    • If true, is it not a violation of many items in your TOU?

      • Yes, or that is certainly what my attorneys think.

  • This is big. And an important question to answer in the world of social media and transparency.

  • This is definitely disappointing. There is a huge gap between just misrepresenting some stats, which is really par for the course, and stealing content from another site. One is just a common, but somewhat sleazy, practice among businesses, and the other is unequivocally unethical. I’m glad that you have brought this to light, and I’m very interested to see what response Snooth has to this.

  • Andrew Hall

    Snooth lifts label pictures from Cellartracker and will not remove them *even when asked* by the owner.

    http://www.snooth.com/wine/gravner-breg-1999/

    Look at the image – that is my backyard and table. There are other wine labels which present similarly, having pieces of my furniture, but this is the most obvious and blatant theft.

    It is a real shame that they are so unethical.

    • Wallstreet

      Implied copyright infringement. Do they have a photographer release from you? hmmmm?

    • At least they stole the picture of a good bottle! 😉

  • Brian Shapiro

    It’s so important for wine reviews “By the People” to come from those who actually review the wines! If this is what’s going on, it’s very sad. Reminds me of the whole “Yelp” issue a year or so ago when they would shuffle reviews etc.

  • Andrew Hall

    If you want further evidence of “keyword” stealing, it is not too hard to find.

    http://www.cellartracker.com/wine.asp?iWine=565180

    vs

    http://www.snooth.com/wine/river-village-cellars-syrah-estate-2006/

    It is amusing, as the ordering of the words is the same as in my tasting note, but their spiders are not tuned enough to pick up ‘lacks complexity’ and give it a keyword of ‘complex.’ You can find this a lot.

    Snooth was informed of all this in 2007! And did nothing whatsoever about it even when complained to by me.

    • Wes Cook

      Looks like Snooth must have removed that review. Very interesting…and their bots couldn’t sort out the winery name either. Classic!

    • Guest

      I like the tag “romanian cheese.”

  • This is hard to believe, and I am glad that it was discovered. I hate to think that Eric needs to change the copyright on CellarTracker or anything, but there must be something we can do. Certainly letting the world know about this enormous ethics lapse is step one.

    • pmabray

      Doug – the best thing we can do is spread the word and make people aware.

  • Andrew Hall

    And another even more obvious one – the word ‘lanolin’ is not a common word.

    http://www.cellartracker.com/wine.asp?iWine=526081

    vs

    http://www.snooth.com/wine/kinkead-ridge-riesling-2007/

    A.

  • If Eric can identify Snooth’s scraper bot, then he can add it to his robots.txt, and if it ignores the robots.txt directives, there are a number of technical measures he can take:

    1) Take a long time to serve requests to the bot, slowing it down
    2) Serve very large responses, causing it to use lots of bandwidth
    3) Poison the Snooth database with incorrect data
    4) Redirect the bot to an irrelevant page

    This is the kind of behaviour we (sadly) expect from a struggling startup, not from a well-funded, mature business like Snooth.
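The robots.txt step suggested in the comment above can be sketched with Python’s standard `urllib.robotparser` module. The bot name `ExampleScraperBot` and the rules below are hypothetical; nothing here reflects CellarTracker’s real robots.txt or the user agent of any real crawler. Note that robots.txt is purely voluntary, so the snippet only shows what a well-behaved crawler would conclude before fetching a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut out one named bot, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: ExampleScraperBot
Disallow: /

User-agent: *
Disallow:
"""


def build_parser(robots_txt):
    """Parse robots.txt text directly, without fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "http://www.cellartracker.com/wine.asp?iWine=948573"

# A compliant crawler asks this question before every request.
scraper_may_fetch = parser.can_fetch("ExampleScraperBot", url)
other_may_fetch = parser.can_fetch("SomeOtherBot", url)

print("ExampleScraperBot allowed:", scraper_may_fetch)
print("SomeOtherBot allowed:", other_may_fetch)
```

A bot that ignores these rules has to be handled on the server side, which is where the numbered measures in the comment come in.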

  • Interesting. Very interesting. Doubtful that there is anything to sue over, but it’s definitely an integrity issue and a short cut and monetization on the backs of others without credit.

    I wondered how they could build such a comprehensive site with a skeleton crew of people. It would appear now that it’s because they borrow liberally.

    I’ll be curious to see what Snooth’s response is … one suspects that hitting the ignore button might be easier than addressing …

    Interesting set of posts, Paul.

    • Jeff, There actually is something to sue for, with statutory damages. Look for a copyright story on Monday, you know where.

  • Jess Poshepny

    This is terrible news considering Snooth’s recent partnership with SCV. Any thoughts on their second company, Lot 18?

    • Anonymous

      I wonder what SCV thinks about this?

  • Just had a nice chat with one of my lawyers, and we are actively exploring our options.

    -Eric

    • Saddle up boys!

    • BigE

      Google scrapes day in, day out. I don’t see any legal options here. I do notice Snooth does a much better job tagging URLs SEO-wise. CellarTracker should invest in a little SEO overhaul. When Snooth launched I took a peek into their site code and saw lots of black hat SEO stuff as well.

      • pmabray

        Black hat? Can you elaborate?

      • Sean Ness

        Google scrapes AND LINKS BACK to what they scrape. A big difference!

      • winelaw

        You cannot use “comments” or “opinions” and claim them as your own. There is plenty of case law on this matter, I hope they have good lawyers or just stop this BS.

  • Eric LeVine must have known about this for some time. Why he hasn’t sued them I’m not sure. But this is clearly dastardly in the extreme. Much worse than all the other stuff you’ve posted.

  • Martin

    I receive daily emails from Snooth. From time to time I have found the blog to be informative. The info you presented yesterday and today is disturbing. If all your facts bear out, I hope bringing it to light will cause Snooth to cease and desist their unethical behavior!

  • One thing is for certain in the wine world: You can develop relationships honestly from the ground up, which will take time or you will find yourself head first diving into the ground. I have always felt that if you want to content share, then get/give permission and everyone wins. Steal it and you should face the consequences of your actions, which is disrespect from the wine world.

    The question Snooth should have asked themselves before taking this course is: How do you raise $2 – 5M on a business plan that involves content hijacking or capital/trademark theft? Is that a Federal Crime? There is a larger consequence here that no one is bringing up that should be brought into the light…

    Eric, sue those guys. You built it, don’t let them steal it. Send an email blast asking for a donation to cover these legal bills and I bet you will see the largest infusion of donations you’ve seen since releasing Cellar Tracker.

    Best in life and wine…

    Alex Andrawes
    CEO, Wines.com & Personal Wine

  • Mindmuse

    I think their reputation is going to go to squat in a matter of days amongst the savvy and influential in the wine world, but I think it will take some aggressive (and deserved) legal action to try to rectify what this is doing for the casual wine lover.

  • I hate to say that I thought this was common knowledge. Glad someone finally said it out loud, maybe it will lead to changes.

    Paul, did you look at any of the other online TN sites? I know CT is the biggest, but it would be interesting to see all the networks they are scraping.

    • pmabray

      Your help would be appreciated there. Thx for the comment my friend.

  • John F.

    Is this a surprise? Since its inception, Snooth’s success has been based on their position in a Google search, not on providing any sort of quality information about wine, the winery, the style, the vintage, the grape or anything else related to the business of wine or the love of wine. After watching many other sites that have tirelessly created original material on wine and wineries lose out to this shell of a company that provides nothing of merit on their own, it is a pleasure to see that someone else has finally taken notice. Now if only they hadn’t gotten rich in the process…

  • John

    It’s telling that Mr. Levine’s reaction is to threaten litigation. I don’t recall ever “claiming our winery name” (which might imply consent to share info) but I note – as have many others – that images on Snooth are pulled from our copyrighted website. Unless Mr. Mabray has made a factual misrepresentation, finding a court that would entertain a complaint of libel might be challenging. However, copyright infringement might be a bit easier.

  • Wow.

    I’m guessing there hasn’t been a response yet from the Snooth side? I **really** hope that one is forthcoming.

    I’m going to create a CT entry with tags like “GeddyLeeIsAwesome” and “Solomon Grundy” and “feverish transducers” (that last one would be an awesome band name, btw) and see if they show up on Snooth in a few days…

    Ok, maybe I won’t do that (Eric – you’re welcome! :-).

    • pmabray

      It is interesting that this just “accidentally” slipped through their attention considering it appears they had sophisticated algorithms to obfuscate the data. Moreover, they only discuss “tasting tags” but fail to mention how they acquired some of these very obscure wines in the database. Were those taken as well? This is just a PR spin apology and is disappointing to say the least.

  • Chrisr

    One thing that CellarTracker, and other similar sites should do is get in contact with Google, Yahoo, and other search engine providers, and complain to them about Snooth’s supposed activity. If they feel that the complaints are valid Snooth will drop completely off of their radar.

  • Chrisr

    What I find more offensive is that if they don’t have any useful data for an entry (e.g., winery name, state, county, etc.) they find something to associate with it.

    In Texas there is a county named Kendall with 5 wineries. Guess what winery is plastered ALL over that region and what is not even mentioned (until this evening). Also, I found an active old winery name that had distilled spirits associated with it.

    How dumb are these folks?

  • Hamishwm

    Content is King……..or should that be Original content is King. I would not want to jump on the bandwagon to denigrate a company but it does sound like lazy practice from Snooth.

  • How refreshing to have someone look into this. I have always wondered how Snooth was able to scale as fast as they did. I am VERY interested to hear what the Snooth gang has to say for itself.

    Thanks Vintank crew for shedding some light on this.

  • This is stealing, and stealing is bad.
    I also have to state that I’m entirely missing the whole point behind Snooth. I tried many times to find useful information on that site, only to realize that there is none. Snooth comes up high in the search, so after a few failed attempts to get any info, I now simply ignore any references to the Snooth web site.

  • As part of the team at PocketGrapes Digital Wine Diary for the Blackberry, I can tell you that it is time consuming and frustrating to collect your own wine data. That being said, PocketGrapes is building a catalog from the ground up, with the help of the user community. I hope that you get a chance to check us out!
    http://www.pocketgrapes.com

  • Pingback: Robots Battle Over Wine | Wine Resources Reviews and Ratings()

    • Rob

      I’ve stumbled onto Snooth REGULARLY over the last couple of years, thanks to what I consider tricky and deceptive search engine optimization. When I search for a wine, Snooth almost always shows up, with some copy in Google claiming to have scores and notes for that wine. However, when I click over, inevitably there is almost no information, but they get their traffic numbers inflated as a result of this tactic which I consider dishonest.

      Imagine my surprise now that I find that much of the LIMITED info of any value they even have was taken from the winery’s own press release, or lifted dishonestly from competitor sites! In my opinion they are dishonest shysters!

      I thought internet SEO nonsense like that got left behind in the 90’s!

      There is NO WAY they can claim that this was an accident. I really hope they get taken down a peg or two for this and their other SEO shenanigans.

  • Isn’t it interesting now that there’s a new story about Bing copying results from Google? Where does this digital mess end?… Any reply from the Snoothers yet? This story gets curiouser and curiouser….
