A company merger is never easy, not only from a legal and economic point of view but also from a data perspective. The main problem we tackle in this case study is category alignment after merging two databases.
Every company or business sector has its own taxonomy. Each taxonomy schema consists of top-level categories and one or more subcategory levels which, taken together, can expand to a couple of thousand categories. Taxonomy mapping requires a good understanding of what each category represents, not to mention the need to “categorise” a large number of data instances, where one instance can be associated with multiple categories. This is usually done manually…
For these reasons, categorised data can contain irregularities and mismatches that seem subtle but usually have a huge impact on overall data quality.
Let’s take a look at two simple use cases where PlaceLab helped our client categorise data and detect miscategorisation in a big dataset.
The client had a couple of million data units, mostly places of interest, and had to merge with another company. The company they acquired had a dataset containing many different points of interest, from restaurants and accommodation to tourist attractions. The acquired company used a different, more granular category schema; for instance, McDonald’s was tagged as fast food in its database and as restaurant in the client’s database. Before merging the two databases, the client needed to remap all the new data records, and they used PlaceLab to automate the process. PlaceLab was trained to understand the client’s entire category schema, from top-level to low-level categories, and it took only a couple of minutes to categorise, or remap, the entire acquired dataset. Amazing, right?
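To make the remapping step concrete, here is a minimal sketch in Python. It is not PlaceLab’s actual method (PlaceLab uses a trained model, as described above); it only illustrates what translating a granular schema into a coarser one looks like, using a hand-written lookup table. The category names, the record layout and the `remap_record` helper are all assumptions made for illustration.

```python
# Hypothetical lookup table from the acquired company's granular categories
# to the client's coarser schema. All names are illustrative assumptions.
SOURCE_TO_TARGET = {
    "fast food": "restaurant",
    "pizzeria": "restaurant",
    "hostel": "accommodation",
    "museum": "tourist attraction",
}


def remap_record(record, mapping=SOURCE_TO_TARGET):
    """Return a copy of the record with its categories translated.

    Categories with no known mapping are collected under "unmapped"
    so that a person can review them instead of losing them silently.
    """
    mapped, unmapped = [], []
    for category in record["categories"]:
        target = mapping.get(category.strip().lower())
        if target is None:
            unmapped.append(category)
        elif target not in mapped:  # avoid duplicate target categories
            mapped.append(target)
    return {**record, "categories": mapped, "unmapped": unmapped}


acquired = [
    {"name": "McDonald's", "categories": ["Fast Food"]},
    {"name": "Harbour Hostel", "categories": ["hostel", "bar"]},
]
remapped = [remap_record(r) for r in acquired]
```

A hand-written table like this quickly becomes impractical once a schema holds a couple of thousand categories, which is the situation an automated approach is meant to address.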
The same service, Category Standardiser, also detected miscategorisations in the new database, which had probably caused a lot of confusion and a bad user experience. In one example, all swimming schools were categorised as education, while all diving schools were categorised as sport. Users browsing the sport categories for swimming schools would never find them, because they would not be listed in the results.
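The swimming school example can be expressed as a simple consistency check. The sketch below is an illustration under stated assumptions, not a description of how Category Standardiser works: it assumes each record carries a hypothetical `type` and `category` field, and that someone has supplied groups of place types expected to share a category. It flags every group whose members ended up in different categories.

```python
from collections import Counter, defaultdict


def dominant_category(records):
    """Map each place type to the category most often assigned to it."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["type"]][record["category"]] += 1
    return {ptype: c.most_common(1)[0][0] for ptype, c in counts.items()}


def find_split_groups(records, related_groups):
    """Return the groups of related place types that are split across categories."""
    dominant = dominant_category(records)
    flagged = {}
    for group_name, types in related_groups.items():
        seen = {t: dominant[t] for t in types if t in dominant}
        if len(set(seen.values())) > 1:  # members disagree on category
            flagged[group_name] = seen
    return flagged


# Illustrative data only; the field names and groups are assumptions.
records = [
    {"type": "swimming school", "category": "education"},
    {"type": "swimming school", "category": "education"},
    {"type": "diving school", "category": "sport"},
    {"type": "museum", "category": "tourist attraction"},
]
related_groups = {"water sport schools": ["swimming school", "diving school"]}
flagged = find_split_groups(records, related_groups)
```

Here `flagged` reports that swimming schools and diving schools sit in different categories, which is the kind of mismatch a reviewer would then resolve by choosing one category for the whole group.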
Category mapping takes a lot of time and resources. However, when you PlaceLab it, the job is done in a fraction of the time. Imagine a database of 10,000 or 100,000 POIs that you would have to verify and remap, all the while ensuring that there are no mistakes. That would be nearly impossible to do by hand. With PlaceLab, we saved our client a lot of time, and we can do the same for you. Within just a few moments, PlaceLab can find the wrong categories and align them to either the PlaceLab category system or the system of your choice.