
Image-Based Product Characterization
When shopping at online stores, consumers often want to filter a set of results by some criterion, for instance seeing only sneakers that have velcro straps or only high-heeled shoes that have pointy toes. Product suppliers provide some of this information to the online store but not all of it, so consumers must search through the product images manually for the remaining characteristics. At the suggestion of a friend who works at Amazon.com, we explored how well computers could tag products automatically, given only example images with and without the desired trait.
We applied a standard approach from the image-classification literature called "bag of visual words." This involves computing a vector (a descriptive list of numbers) at different locations on an image and deciding, for each vector, which general "class" of vectors it most resembles. Each class is implicitly associated with a word, so we end up with a group of "visual words" that describe each image. Given a product we've never seen before, we look at the "visual words" that describe it, compare them with the words describing the images we have seen, and give the new product the label (e.g., "pointy toe" or "nonpointy toe") of the known images to which it is most similar. Using roughly 70 examples of "pointy toe" and "nonpointy toe" shoes, our system achieves over 90% accuracy.
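The pipeline described above can be illustrated with a minimal sketch. It is not the project's actual code: the descriptors are made-up 2-D points rather than real image features, the classes are found with plain k-means, and similarity is the Euclidean distance between word histograms with a single nearest neighbor. All function names (build_codebook, word_histogram, classify) are hypothetical, and the abstract does not say which descriptor, clustering method, or similarity measure was actually used.

```python
import math
import random


def nearest(vec, centers):
    """Index of the center closest to vec (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, centers[i])))


def build_codebook(descriptors, k, iters=10, seed=0):
    """Plain k-means; each resulting center stands for one 'visual word'."""
    rng = random.Random(seed)
    centers = rng.sample(descriptors, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def word_histogram(descriptors, centers):
    """Normalized count of how often each visual word occurs in one image."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest(d, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def classify(hist, labeled_hists):
    """Give hist the label of the most similar known histogram (1-nearest neighbor)."""
    best = min(labeled_hists, key=lambda item: math.dist(hist, item[0]))
    return best[1]


# Toy data (invented for illustration): each "image" is a list of 2-D descriptors.
train = [
    ([[0, 0], [0, 1], [10, 10]], "pointy toe"),
    ([[1, 0], [10, 11], [11, 10]], "nonpointy toe"),
]
pooled = [d for image, _ in train for d in image]
codebook = build_codebook(pooled, k=2)
known = [(word_histogram(image, codebook), label) for image, label in train]

query = [[0, 0], [1, 0], [0, 1], [11, 10]]  # an unseen "image"
print(classify(word_histogram(query, codebook), known))
```

In a real system the descriptors would come from an image feature extractor and the codebook would hold many more words; the nearest-neighbor rule here is only one of several ways to compare an unseen image with the labeled examples.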
Brian Tomasik '09
Phyo Thiha '09