Snooth User: Andrew46

Search by region not working well - Particularly for Humboldt.

Posted by Andrew46, Jun 26, 2010.


I did a search just to see which of the local Humboldt wines are listed on Snooth.  There are problems with the search.  Firstly, 3 categories come up:  Humboldt, Humboldt County, and Humboldt/Trinity County.  This is confusing.

Second, many of the wines that come up in a search for Humboldt are not from Humboldt at all.  Here are several examples:


This appears to be from Paso Robles, but has willow creek in the name:


This one is actually cheese:


This one is an Aussie:


This one appears to be from Chile, but with a similar name to a Humboldt winery:

Price is $4990.00  Is that right?


This one is from Sonoma, but has willow creek in the name:


Can this issue be fixed, or is it an issue across the board with the search feature?

Thanks for looking at it.

Andrew (From Briceland Vineyards Winery in Humboldt County)

Reply by Philip James, Jun 27, 2010.

Andrew - thanks for bringing this to our attention, someone will look at it, and post a response here.


Reply by Gregory Dal Piaz, Jun 27, 2010.

Hi Andrew,


Thanks for bringing these exceptions to our attention. We are always working to improve on our extraction process, by which we populate fields such as winery and region. These specific examples will be very helpful to the tech team working on this process.

As you can see, we have mis-identified several wines as being from Willow Creek. The fact that the Willow Creek AVA covers two separate counties obviously complicates the process, and moving forward we will probably have to list all Willow Creek wines as belonging to the subregion Humboldt/Trinity Counties.

Back to the wines though, some of the errors you find on Snooth are due to the extraction process we use, while others are fed into the system via retailer feeds that include this mistaken information. We are actually working on a process which will identify those listings (be it varietal, region or winery) in which we have less confidence, and will let our user base help to correct or confirm them.

In the meantime, our super users are encouraged to help where they see fit by taking full advantage of the edit functionality they have access to. It can be accessed by clicking on the edit/add icon located on the right of the screen under the "I Like This" widget. By clicking that button our super users will have access to all the information we have on a wine, and are able to correct and add information as needed.

I have gone through and quickly updated the examples you listed but please feel free to update any wines you find containing errors. We will be posting about our tech progress here in the forums.

We are all looking forward to better data here on Snooth, and we appreciate all the help our members have offered, and continue to offer!

By the way, nice looking Paella!


Reply by Andrew46, Jun 27, 2010.

Hi Gregory & Philip,

Thanks for looking into this.

I appreciate the challenge of maintaining a DB this large with multiple inputs.

These two still appear wrong:

Actually Sonoma?, but has willow creek in the name:

Actually Paso Robles?, but has willow creek in the name:

Everything else seems to be right. 

Question:  Is the main source on entry into the system online retailers?

Reply by ChipDWood, Jun 27, 2010.

"Question:  Is the main source on entry into the system online retailers?"

- Distributors maybe?  That's a great question, because if it is then there has to be a way of getting them to pay more attention to what comes from where.

There's a bunch of sloppy when it comes to organizing this stuff- and it certainly isn't just Snooth.  Sometimes even the labels mis-identify for Pete's sake.

Reply by Philip James, Jun 27, 2010.

The data comes mainly from wineries and retailers - and yes, we're at the mercy of sloppy input data, which our users and the snooth team directly spend a lot of time fixing. It would be nice to have cleaner input data (wineries?).

Reply by Andrew46, Jun 28, 2010.

Yes, if every winery put in the correct info about where the grapes were sourced, this would solve the problem.  Absent that, it seems that the system is guessing wrong sometimes.  Somehow, Villa Creek, or Willow Creek in the vineyard name gets confused with the region from which the grapes are sourced, right?

Reply by Andrew46, Jul 12, 2010.

Hi Guys,

There are more new wrong listings going onto the site.

This one is from NY, but the winery is named "Willow Creek":

Top 3 in this list have the same issue:  NY wines but listed as from Humboldt:

This one is Paso Robles, says so in the name:

This one is Sonoma with willow creek as the vineyard name:

This one is an Aussie, but has "willow creek" as winery name:

Another Aussie:

More Aussie:

More Sonoma:

More Paso Robles:

I only made it through the first 5 pages of the list and this is what I could see.  It is disappointing to look at what is listed on the site from Humboldt and find this many errors.  Can this issue be fixed?  I think there is a "bot" that thinks if willow creek is in the name, it must be Humboldt.  This much wrong info would seem off-putting to someone who is actually looking for Humboldt wine, don't you think?

Also, Coates is very well represented in the search.  They are fine people.  I have not heard feedback on the wines any time recently.  Has anyone here had one?  If it was good, I'd love to hear which one it was so I can taste it.

Reply by Andrew46, Jul 12, 2010.

From your past post:

"I have gone through and quickly updated the examples you listed but please feel free to update any wines you find containing errors."

How do I do that?

Reply by Andrew46, Jul 25, 2011.

This issue has not been resolved.  Rivers Marie Pinot from Willow Creek Vineyard still comes up in a search for Humboldt.  I suppose the bot thinks that since it says Willow Creek it is in the Willow Creek App.  There are some Aussie wines that still come up too.  Also, some cheese comes up as if it is wine.  Do you guys take accurate info seriously?  New features might be good, but getting your crawler to work right would make your site not spew loads of wrong data.

Reply by Mark Angelillo, Jul 25, 2011.

Hi Andrew - there's no crawler at work here, data is provided by merchants and wineries both, and is accepted as provided until proven untrustworthy. This accounts for some inconsistencies as we eventually become consistent with reality.

That said, thanks for your comments. We do take data quality seriously and in fact pay several employees to work on it. We clearly have not looked at searches for Humboldt recently as data is added, and rely on your comments to help where you are experiencing low quality results.

We've added rules to remove Humboldt Fog and set up Willow Creek Vineyards as an Australian winery. Also found an Amador wine in the results and moved that. Thanks and feel free to send along other issues.

Reply by dmcker, Jul 25, 2011.

Mark, crawlers, algorithms, or not, sounds like you need to address these issues. Andrew has been exceedingly sincere in bringing them to attention over a fair period of time. Snooth does have a rep for sloppiness in this area, and methinks visible remedial action would pay off in a number of ways....

Reply by Andrew46, Jul 25, 2011.


Sorry.  I am calling BS.  No merchant or winery is entering an Australian wine and inputting the county as Humboldt or the AVA as Willow Creek.  Willow Creek is in the name and there must be some automated way it is getting dumped into the Humboldt search.  I have pointed this out multiple times.  I have spent time making corrections to the info for you.  The wrong info gets back in, somehow.

Rivers Marie wines with Willow Creek in the vineyard name also get caught in this category.

But, I am glad that you have attempted to fix it.  There are still Aussie wines like this:

and this:

Not sure where this one is from, but not Humboldt:

This is not info that is being put in by merchants or wineries.  Seriously.

Reply by Mark Angelillo, Jul 25, 2011.

Gentlemen - I can't and won't tell you we will fix all of Snooth's database of wines now. That would be a lie. We have worked on this for years and will continue to - trust me I know about this problem intimately and feel your pain when I (as they say) eat my own dog food. I do and Snooth does appreciate your sincerity and patience. I'm asking for you not to doubt mine as we work towards a better and more useful data set. Our interests are aligned on this 100%.

Andrew - data is provided by merchants (and wineries, however their data tends to be cleaner). In the case of Willow Creek being in the name, yes we do pull that out from the name and try to set the region where possible (the source did not provide a region and we do our best to predict it - this approach does have some issues). To add some visibility - we have not looked at Willow Creek Winery/Vineyards until today but it appears there is one in Australia, one in Niagara, and one in Amador. There's also an AVA, so you can see there are some challenges with names colliding. Without manual intervention this kind of inconsistency is difficult to expose programmatically. If the data already existed in a clean way somewhere, the need for Snooth would be reduced.
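
For readers wondering how a name-based guess goes wrong, here is a minimal sketch in Python. It is not Snooth's actual code; the phrase table, the function name `guess_region`, and the sample listing names are all made up for illustration. It only shows why a listing with no supplied region and "Willow Creek" in its name lands in Humboldt/Trinity regardless of where the wine is really from.

```python
# Hypothetical phrase-to-region table; an assumption, not Snooth's real data.
PHRASE_TO_REGION = {
    "willow creek": "Humboldt/Trinity Counties",
    "paso robles": "Paso Robles",
}


def guess_region(name, supplied_region=None):
    """Return the region supplied by the feed if there is one,
    otherwise the region of the first known phrase found in the name."""
    if supplied_region:
        return supplied_region
    lowered = name.lower()
    for phrase, region in PHRASE_TO_REGION.items():
        if phrase in lowered:
            return region
    return None  # nothing recognizable in the name
```

With this kind of lookup, `guess_region("Willow Creek Winery Shiraz")` returns the Humboldt/Trinity region even for an Australian producer, while a listing that arrives with its own region is left alone.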

All this being said, let's talk about some of the things we are doing in this arena. We recently provided a way for wineries to lock down their Snooth search results to prevent bad data from polluting them. This is working out extremely well for those wineries who use this feature, and we are hoping to bring this strategy to regions soon.

We're also redoing the search page as it exists now and will be keeping in mind the extent to which we trust the data that is on the site. We have done the heavy lifting to improve the tracking of this information on the backend, and although this is not visible to the end user yet it was a good chunk of work and we're close to reaping the benefits of that project.

As far as Humboldt goes, I just made a few more cleanups to the results. Hopefully they continue to look better.

Thank you all for your patience as we improve the experience on Snooth in many ways.

Reply by Andrew46, Jul 25, 2011.

These two statements only serve to confuse the fact that your system is pulling info from the names of wineries and vineyards and putting them in the wrong category.  Whatever you call that automated system is the problem.

"data is provided by merchants (and wineries, however their data tends to be cleaner). In the case of Willow Creek being in the name, yes we do pull that out from the name and try to set the region where possible (the source did not provide a region and we do our best to predict it - this approach does have some issues)."

It would be, IMO, better to leave those wines that don't bother to note where they are from with that info blank, than put it in from the name and have it be wrong this often.  Just my idea.
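
A rough Python sketch of that idea, assuming a made-up table in which each phrase lists every place it could refer to (the four Willow Creek candidates are the ones named earlier in this thread; the table, the function name `conservative_region`, and the listing names are hypothetical). A phrase only counts when it points to exactly one place; otherwise the region stays blank.

```python
# Hypothetical table: every place each phrase could refer to.
CANDIDATES = {
    "willow creek": {"Humboldt/Trinity Counties", "Australia", "Niagara", "Amador"},
    "paso robles": {"Paso Robles"},
}


def conservative_region(name, supplied_region=None):
    """Fill in a region only when the name is unambiguous; else leave it blank."""
    if supplied_region:
        return supplied_region
    lowered = name.lower()
    certain = set()
    for phrase, regions in CANDIDATES.items():
        # Ambiguous phrases (more than one possible place) contribute nothing.
        if phrase in lowered and len(regions) == 1:
            certain |= regions
    if len(certain) == 1:
        return next(iter(certain))
    return None  # blank rather than a wrong guess
```

Here a listing with only "Willow Creek" in the name comes back blank, while one that also says "Paso Robles" is filed under Paso Robles.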

Reply by Mark Angelillo, Jul 25, 2011.

Hi Andrew - I do appreciate your concern. It may not be clear immediately from what you see on the site, but we have performed this extraction on nearly 60% of the over 7 million wine listings that have been brought into Snooth over time. I agree it's very painful when our algorithm's choice is wrong, but this information feeds many of the other algorithms that run the site and we unfortunately have had to accept an error rate. We also continually work to clean things up as we go by adding specific rules (thousands of rules and exceptions are currently in play), and have built a number of tools internally to help us do this. Hopefully the data is continually better and eventually consistent.
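
As a loose illustration of what a layer of rules and exceptions can look like, here is a small Python sketch. It is an assumption about the general shape only, not the actual rule engine; the two rules mirror the fixes mentioned earlier in this thread (removing Humboldt Fog, treating Willow Creek Vineyards as an Australian winery), and the names `EXCEPTIONS` and `apply_exceptions` are invented.

```python
# Hypothetical exception rules, checked in order before any generic guess.
# Each rule: (phrase that must appear in the listing name, action, value).
EXCEPTIONS = [
    ("humboldt fog", "exclude", None),                  # a cheese, not a wine
    ("willow creek vineyards", "region", "Australia"),  # known producer
]


def apply_exceptions(name):
    """Return (action, value) for the first matching rule, or None."""
    lowered = name.lower()
    for phrase, action, value in EXCEPTIONS:
        if phrase in lowered:
            return (action, value)
    return None  # no exception applies; fall back to the generic extraction
```

A rule like this only catches the exact phrase it was written for, which is why a new name collision needs a new rule each time.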

I'm happy to answer any other questions about how this process works. It might be worth me writing some more extensive information about what we're doing to address this problem, but I don't have the time for that level of post just this second. It sounds like it would be important/helpful information to have out there though, so I'll make sure it gets on my list.

Reply by Andrew46, Jul 26, 2011.

I don't find it painful that your algorithm is wrong from time to time.  What was bugging me was that when I would bring this up, people would say that it had been addressed.  However, the issue had not actually been solved.  The other thing that bugged me was that people would say that the wrong info was not coming from a crawler.  That it was coming from whoever put the info in, like the winery, vendor etc.  That is total BS.  As you have noted, the "extractor" thinks a wine is from Humboldt because it has Willow Creek in the name of the winery or the vineyard.  Now what it sounds like you are saying is that your "extractor", whatever that is, gets stuff wrong.  But you rely on it.  I can only assume that Humboldt is not the only place that this happens.  So, we can only expect Snooth to have a certain amount of wrong info now and forever.  And that this is OK with you.  Fair enough.  If that is how your system works, then I guess I can just take whatever is here with a grain of salt, since putting up wrong info is part of how the site works.

Nice folks here in the discussion though.

Reply by Mark Angelillo, Jul 26, 2011.

Hi Andrew - thanks, now I understand what is bugging you. If you go back and read the beginning of the thread you will find a more thorough explanation from Greg.

We've done our best to clean up Humboldt for now and added some rules and exceptions to catch future errors. We'll continue to work at it.

Reply by Andrew46, Nov 21, 2011.


I had been away from Snooth for a while and have just returned.  This issue seems worse than it was before.  I searched for Humboldt here and all I find is Coates and some Aussie stuff.  I go to Humboldt/Trinity here and only 2 wines come up, and they are both Aussie.  I then search for our wines, using our winery name, Briceland, and our wines are there, but are all listed as out of stock.  Perhaps I need to update our wines in the system, but the search still does not work.  Thoughts?

Reply by Chris Carpita, Nov 21, 2011.

@Andrew46 - we're releasing some pretty significant changes to search very soon - it should at least help us see the data more clearly and pick out the specifics of what's going on here, so we can clean up the extraction rules

Reply by Andrew46, Dec 27, 2011.

Hi Snooth People,

I dropped back by to see what is new here.  I figured I'd check and see if the search is working now.  


In the Humboldt search, I find wineries from Chile, Paso Robles, Australia and at least one other place that I don't recall at the moment.  Some have Humboldt or Willow Creek in the name, but some don't.

Seriously.  I like many of the people here and appreciate their hard work and love of wine.  But I have trouble devoting time to posting and sharing info on a site that promotes itself as a place to search for wines when the search is wrong this much.  Will this ever be resolved?  If not, tell me now so I can stop checking.

Sorry to be harsh, but I think you have had ample time to address this.

