Full text research queries

teze · January 27, 2016, 8:00pm

Hello,

I am very new to Aerospike. I installed it on a server and ran some tests using the C# client, and it looks very promising.

I am currently working on a project involving a large volume of data with a high rate of reads and inserts. Part of this project involves saving millions of text articles to the database each week (coming, for instance, from RSS feeds, averaging 500 characters each). Each entry would then be analyzed to match possible tags and topics, and to compute various other statistics.

The tag matching part would probably need some kind of full-text search tool (kind of like the MATCH ... AGAINST or LIKE statements in SQL), but I cannot find anything related to this in the Aerospike features. Is there a full-text search tool available, or is it possible to link Aerospike to an existing one? Do you think Aerospike is well suited to storing this kind of text article?

Thanks a bunch, and sorry if this has already been asked, couldn’t find anything!

Cheers

Pepe_Cerezo · April 19, 2016, 3:19pm

Hmm, this was posted on Jan 27 and there is still no answer or workaround… I have the same question.

Should I migrate my MySQL database to Aerospike? If I can't run queries like select * from table where column like '%hello%', maybe it is not my best choice.

The only solution I can see for now is to fetch all the bins and search the text in the client language (PHP, Java, etc.). That way, all the efficiency is lost.

rbotzer · April 19, 2016, 4:42pm

I’ve answered this same question on StackOverflow, but here’s the gist:

You should not be doing full text searching with a SQL LIKE operator on an RDBMS, anyway. This is where a system like Apache Solr makes sense. In Aerospike, you would use a stream UDF to implement this functionality, if you insist on having it.

To search for articles that have a given tag, or for all the tags of an article, you would just model your data in a way that makes sense, without needing a LIKE search.

For example, use a set article-tags where the PK is the articleID and a bin tags holds a growing list of either tag strings or tag IDs. For the orthogonal search, you would have a set tag-articles where the PK is the tagID and a bin contains the list of article IDs for that tag. In both cases you use the list append() operation to add values to those lists.

Alternatively, you can have a bin tags in the set articles which is a list of tags for the article. You can then build a secondary index on that list for searches.
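
To make the two-set model concrete, here is a minimal sketch in plain Python. It does not use the Aerospike client at all: two dictionaries stand in for the article-tags and tag-articles sets, and the helper names (tag_article, tags_for, articles_for) are made up for illustration. With a real client, each append would presumably be a list append operation on the corresponding record.

```python
from collections import defaultdict

# Plain dicts stand in for the two sets; this is NOT Aerospike client code.
article_tags = defaultdict(list)   # PK = article ID -> list of tags
tag_articles = defaultdict(list)   # PK = tag ID     -> list of article IDs


def tag_article(article_id, tag):
    """Record the tag in both directions, mimicking a list append on each record."""
    if tag not in article_tags[article_id]:
        article_tags[article_id].append(tag)
    if article_id not in tag_articles[tag]:
        tag_articles[tag].append(article_id)


def tags_for(article_id):
    """All tags of one article: a single key lookup, no text scan."""
    return list(article_tags.get(article_id, []))


def articles_for(tag):
    """All articles carrying one tag: again a single key lookup."""
    return list(tag_articles.get(tag, []))


tag_article("a1", "databases")
tag_article("a1", "nosql")
tag_article("a2", "nosql")

print(tags_for("a1"))
print(articles_for("nosql"))
```

Both lookups are reads by primary key, which is why this model avoids a LIKE-style scan. The trade-off is that every tagging operation writes to two records.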

Pepe_Cerezo · April 22, 2016, 4:42am

Thanks for your answer, rbotzer. The technique you propose is a good one, although I still need to search for text. I have an application that searches for words or parts of words.

I think MongoDB is better suited to this purpose, and it is still a NoSQL database.

Best regards.
