Case study
Food Ontology
Improving product categorization using Natural Language Processing for LATAM's largest food delivery service, operating in 9+ countries
PedidosYa is part of Delivery Hero and is the market leader for food delivery in LATAM. It operates in multiple countries in the region and is expanding by acquiring competitors such as Glovo.
The goal
The client has a vast and diverse food catalog covering all the restaurants in the region that offer their services through the PedidosYa application.
We were given the task of improving the existing categorization and extracting further structured information that would allow PedidosYa to improve its search results, recommendations and decisions. It was a typical natural language processing problem.
The Data
In this case the data was vast but unlabeled, and labelling the entire dataset was not possible because of time constraints. Natural language complexities abounded: the data was entered by small restaurant owners, and language variations from country to country only made things worse.
What was a sandwich in some locations was an emparedado in others, and while some ice cream shops sell by the kilogram, others sell by the litre. French fries are usually a side dish, but sold alone they can be a plate to share with others, and what is called Peruvian cuisine in Argentina is just a typical plate in Peru.
Our Solution
Alongside PedidosYa's great team, we designed and built a data processing pipeline that applies natural language processing combined with classifiers in multiple stages. Accuracy of the pipeline components reached 94% or more.
Embeddings were used to extract tags from individual components, while XGBoost and CatBoost were used at different stages to detect multiple components in a single item and to classify each of them correctly.
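To make the staged idea concrete, here is a minimal sketch, not the production pipeline. It assumes hand-written toy vectors in place of learned embeddings and a simple connector-based splitter in place of the XGBoost and CatBoost models; every name in it (`TOY_EMBEDDINGS`, `TAG_VECTORS`, `split_components`, `tag_component`, `tag_item`) is hypothetical.

```python
import math
import re

# Hypothetical 3-dimensional vectors standing in for learned word embeddings.
TOY_EMBEDDINGS = {
    "sandwich": (0.9, 0.1, 0.0),
    "emparedado": (0.85, 0.15, 0.05),
    "papas": (0.1, 0.9, 0.1),
    "helado": (0.0, 0.1, 0.95),
}

# Hypothetical tag prototypes living in the same toy vector space.
TAG_VECTORS = {
    "sandwich": (1.0, 0.0, 0.0),
    "side_dish": (0.0, 1.0, 0.0),
    "ice_cream": (0.0, 0.0, 1.0),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def split_components(item_name):
    """Stage 1 stand-in: split an item name on simple connectors.

    The case study used trained classifiers for this step; a regular
    expression is used here only to keep the sketch self-contained.
    """
    parts = re.split(r"\s+(?:con|y|\+)\s+", item_name.lower().strip())
    return [part for part in parts if part]


def tag_component(component, threshold=0.8):
    """Stage 2: return the most similar tag, or None if nothing is close enough."""
    best_tag, best_score = None, threshold
    for word in component.split():
        vector = TOY_EMBEDDINGS.get(word)
        if vector is None:
            continue  # unknown word: no embedding available
        for tag, tag_vector in TAG_VECTORS.items():
            score = cosine(vector, tag_vector)
            if score >= best_score:
                best_tag, best_score = tag, score
    return best_tag


def tag_item(item_name):
    """Run both stages and pair each detected component with its tag."""
    return [(c, tag_component(c)) for c in split_components(item_name)]


print(tag_item("Emparedado con papas"))
```

In this toy setup, "emparedado" and "sandwich" land on the same tag because their vectors point in nearly the same direction, which is the property that lets embeddings absorb country-to-country vocabulary differences.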
Technology
We built a data pipeline in Python on Google's Dataproc that supports multiple inputs and outputs, so it can adapt to changing architecture definitions and facilitate future experimentation.
The system can read from and write to Google's BigQuery or MongoDB, among others.
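One way to get interchangeable inputs and outputs is to code the pipeline against small source and sink interfaces. The sketch below is an illustration under that assumption, not the actual system: `Source`, `Sink`, `InMemorySource`, `InMemorySink` and `run_pipeline` are hypothetical names, and the in-memory classes stand in for real BigQuery or MongoDB adapters, which would implement the same two methods.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Protocol

Record = Dict[str, object]


class Source(Protocol):
    """Anything that can yield records (hypothetical interface)."""

    def read(self) -> Iterator[Record]: ...


class Sink(Protocol):
    """Anything that can accept processed records (hypothetical interface)."""

    def write(self, records: Iterable[Record]) -> None: ...


class InMemorySource:
    """Stand-in for a real input adapter; holds records in a list."""

    def __init__(self, records: Iterable[Record]) -> None:
        self._records = list(records)

    def read(self) -> Iterator[Record]:
        return iter(self._records)


class InMemorySink:
    """Stand-in for a real output adapter; collects records in a list."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def write(self, records: Iterable[Record]) -> None:
        self.records.extend(records)


def run_pipeline(
    sources: Iterable[Source],
    transform: Callable[[Record], Record],
    sinks: Iterable[Sink],
) -> int:
    """Read from every source, transform each record, write to every sink."""
    processed = [transform(record) for source in sources for record in source.read()]
    for sink in sinks:
        sink.write(processed)
    return len(processed)


catalog = InMemorySource([{"name": "Emparedado"}, {"name": "Helado"}])
warehouse, document_store = InMemorySink(), InMemorySink()
count = run_pipeline(
    [catalog],
    lambda record: {**record, "name_normalized": str(record["name"]).lower()},
    [warehouse, document_store],
)
print(count, warehouse.records)
```

Because the pipeline only depends on `read` and `write`, swapping a storage backend means writing one new adapter class and leaving the processing stages untouched.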
Results
[Figures: validation, records]