I'm working on parsing two large databases and merging them into one. The two databases contain similar data but have no reference to each other. They have been converted to JSON, so I'm iterating over one array for each entry in the other. I'm matching strings, and since the data is very similar, when a match is found the ID from one record is inserted into the other, so we have a reference point in the final database.
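To make the setup concrete, a simplified sketch of this kind of loop might look like the following. It is illustrative only: the field names `id`, `name`, and `ref_id`, the function names, and the 0.85 threshold are all hypothetical, and Python's standard-library `difflib.SequenceMatcher` stands in for whatever fuzzy matcher is actually used.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity ratio in [0, 1]; difflib is a stand-in fuzzy matcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(primary, secondary, threshold=0.85):
    """For each primary record, store the ID of the best fuzzy match in secondary.

    The field names ("id", "name", "ref_id") are assumptions for illustration.
    """
    for rec in primary:
        best_id, best_score = None, threshold
        # Every primary record is compared with every secondary record,
        # so the work grows as len(primary) * len(secondary).
        for cand in secondary:
            score = similarity(rec["name"], cand["name"])
            if score >= best_score:
                best_id, best_score = cand["id"], score
        if best_id is not None:
            rec["ref_id"] = best_id  # the cross-reference into the other database
    return primary
```

The nested loop is where the time goes: the inner comparison runs once for every pair of records.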
The parsing is very CPU-intensive because I'm using fuzzy string matching in the hope of finding matching strings even when they have slight variations. Both databases have 50,000+ entries, so running this takes a long time: about 8 hours on my MBP.
Normally I would just let it crunch the data overnight, but I want to make some changes to my algorithm and try multiple things to find the best results, so having to wait that long for each run gets annoying.
So, is it possible to offload some of the work to the GPU? Fuzzy string matching is matrix-based, so I thought it might be a good candidate for the GPU. Would using Metal help reduce the amount of time it takes?
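For context, the matrix here would be the dynamic-programming table of an edit-distance metric. A minimal Python sketch, assuming Levenshtein distance is the metric (the actual algorithm may differ, and the function name is only illustrative):

```python
def levenshtein(a, b):
    """Edit distance between a and b, computed row by row over the DP matrix."""
    # prev holds row i-1 of the (len(a)+1) x (len(b)+1) matrix;
    # row 0 is the cost of building each prefix of b from an empty string.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # column 0: cost of deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]
```

Each cell depends only on its left, upper, and upper-left neighbours, and each call is independent of every other pair of strings, which is the property that would matter for spreading the work across CPU cores or GPU threads.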
Thanks!