Below is an application written in ASP.NET MVC using the NEST library and the ingest-attachment plugin. The application uploads a file to the filesystem and indexes it in Elasticsearch. A search query then finds the searched text in the indexed documents and gives access to the file for download.
NEST is a high-level Elasticsearch client for .NET. It can be installed from the Package Manager Console inside Visual Studio using
Install-Package NEST
or we can install it from the NuGet Package Manager UI.
In the previous post we installed the ingest-attachment plugin for indexing documents in Elasticsearch. Now let's put it to work. First, how do we create a client in our code?
var documentsIndex = "documents";

var connectionSettings = new ConnectionSettings()
    .InferMappingFor<Document>(m => m
        .IndexName(documentsIndex));

var client = new ElasticClient(connectionSettings);
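By default ConnectionSettings talks to http://localhost:9200. If your cluster lives elsewhere, you can pass the node address explicitly; a minimal sketch (the URL and variable names below are assumptions, replace them with your own):

// Sketch: point the client at a specific node instead of relying on the default.
var settingsWithUri = new ConnectionSettings(new Uri("http://localhost:9200"))
    .DefaultIndex(documentsIndex)
    .InferMappingFor<Document>(m => m.IndexName(documentsIndex));

var clientWithUri = new ElasticClient(settingsWithUri);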
Shards and replicas are important. If we want to set their counts we need an IndexSettings object:
var indexSettings = new IndexSettings();
indexSettings.NumberOfReplicas = 1;
indexSettings.NumberOfShards = 3;
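These settings are not applied by themselves; they have to be attached to the index when it is created. A minimal sketch of two ways to do that, assuming the documentsIndex and client from above (the response variable names are just for illustration):

// Sketch: pass the IndexSettings object through an IndexState...
var createWithState = client.CreateIndex(documentsIndex, c => c
    .InitializeUsing(new IndexState { Settings = indexSettings }));

// ...or set the counts inline in the fluent CreateIndex call.
var createInline = client.CreateIndex(documentsIndex, c => c
    .Settings(s => s
        .NumberOfShards(3)
        .NumberOfReplicas(1)));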
We can take advantage of a feature within NEST known as automapping to simplify the mapping further; automapping infers the document mapping to send to Elasticsearch based on the types of the properties on the POCO.
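To see what automapping alone gives you, a minimal sketch of an index whose Document mapping is entirely inferred (no custom analyzers or property overrides) would be:

// Sketch: let NEST infer the whole Document mapping from the POCO.
var autoMappedIndex = client.CreateIndex(documentsIndex, c => c
    .Mappings(m => m
        .Map<Document>(mp => mp.AutoMap())));

In this application, though, we combine AutoMap with a custom path analyzer and explicit mappings for the attachment fields: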
var indexResponse = client.CreateIndex(documentsIndex, c => c
    .Settings(s => s
        .Analysis(a => a
            .Analyzers(ad => ad
                .Custom("windows_path_hierarchy_analyzer", ca => ca
                    .Tokenizer("windows_path_hierarchy_tokenizer")
                )
            )
            .Tokenizers(t => t
                .PathHierarchy("windows_path_hierarchy_tokenizer", ph => ph
                    .Delimiter('\\')
                )
            )
        )
    )
    .Mappings(m => m
        .Map<Document>(mp => mp
            .AutoMap()
            .AllField(all => all
                .Enabled(false)
            )
            .Properties(ps => ps
                .Text(s => s
                    .Name(n => n.Path)
                    .Analyzer("windows_path_hierarchy_analyzer")
                )
                //.Object<Attachment>(a => a
                //    .Name(n => n.Attachment)
                //    .AutoMap()
                //)
                .Object<Attachment>(a => a
                    .Name(n => n.Attachment)
                    .Properties(p => p
                        .Text(t => t
                            .Name(n => n.Name)
                        )
                        .Text(t => t
                            .Name(n => n.Content)
                            .TermVector(TermVectorOption.WithPositionsOffsets)
                            .Store(true)
                        )
                        .Text(t => t
                            .Name(n => n.ContentType)
                        )
                        .Number(n => n
                            .Name(nn => nn.ContentLength)
                        )
                        .Date(d => d
                            .Name(n => n.Date)
                        )
                        .Text(t => t
                            .Name(n => n.Author)
                        )
                        .Text(t => t
                            .Name(n => n.Title)
                        )
                        .Text(t => t
                            .Name(n => n.Keywords)
                        )
                    )
                )
            )
        )
    )
);
You can inspect the mapping with GetMapping<T>() like this:
var mappingResponse = client.GetMapping<Document>();
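Every NEST response also carries DebugInformation with the request and response bodies, which is handy when the mapping does not look the way you expect; a minimal sketch:

// Sketch: dump the raw mapping request/response for inspection.
Console.WriteLine(mappingResponse.DebugInformation);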
Now let's create an ingest pipeline. It runs the attachment processor on the base64-encoded Content field, writes the extracted data to the Attachment field, and then removes the original Content field so the raw base64 is not stored:
client.PutPipeline("attachments", p => p
    .Description("Document attachment pipeline")
    .Processors(pr => pr
        .Attachment<Document>(a => a
            .Field(f => f.Content)
            .TargetField(f => f.Attachment)
        )
        .Remove<Document>(r => r
            .Field(f => f.Content)
        )
    )
);
Next we create a document class for indexing:
public class Document
{
    public DateTime VerdictDate { get; set; }
    public int VerdictYear { get; set; }
    public int VerdictNo { get; set; }
    public string VerdictSubject { get; set; }
    public string Path { get; set; }
    public string Content { get; set; }

    // Nest.Attachment, populated by the ingest-attachment processor
    public Attachment Attachment { get; set; }
}
For indexing we will use client.Index:
var directory = Directory.GetCurrentDirectory();
var base64File = Convert.ToBase64String(File.ReadAllBytes(Path.Combine(directory, "20171110.pdf")));

client.Index(new Document
{
    VerdictNo = 23,
    VerdictDate = DateTime.Now,
    VerdictSubject = "Karar Konulu Konulu",
    VerdictYear = DateTime.Now.Year,
    Path = @"E:\Development\Elasticsearch\Elasticsearch\ES.Console\bin\Debug\20171110.pdf",
    Content = base64File
}, i => i.Pipeline("attachments"));
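The Index call above discards its response, so a failure (for example a missing pipeline) goes unnoticed. A minimal sketch of checking it, assuming the call is assigned to a hypothetical indexDocResponse variable:

// Sketch: surface indexing failures instead of ignoring them.
if (!indexDocResponse.IsValid)
{
    // DebugInformation contains the request, the response and any server error.
    Console.WriteLine(indexDocResponse.DebugInformation);
}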
To create a search query with highlights I used the fast vector highlighter as below:
var searchResponse = client.Search<Document>(s => s
    .Query(q => q
        .Match(m => m
            .Field(a => a.Attachment.Content)
            .Query("ANKARA")
        )
    )
    .Highlight(h => h
        .PreTags("<em>")   // tags to wrap highlighted terms in (illustrative choice)
        .PostTags("</em>")
        .Encoder("html")
        .Fields(fs => fs
            .Field(p => p.Attachment.Content)
            .Type(HighlighterType.Fvh)
            .PreTags("<em>")
            .PostTags("</em>")
            .NumberOfFragments(3)
            .BoundaryMaxScan(50)
            .PhraseLimit(10)
            .HighlightQuery(q => q
                .Match(m => m
                    .Field(p => p.Attachment.Content)
                    .Query("ANKARA")
                )
            )
        )
    )
);
To get the highlights, look at the Hits in the response:
var highlightsInEachHit = searchResponse.Hits.Select(s => s.Highlights);
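In NEST 5/6 each hit's Highlights is a dictionary keyed by field name, and each HighlightHit holds the highlighted fragments. A minimal sketch of printing them, assuming the searchResponse above:

// Sketch: walk the highlight fragments of every hit.
foreach (var hit in searchResponse.Hits)
{
    foreach (var fieldHighlight in hit.Highlights)
    {
        foreach (var fragment in fieldHighlight.Value.Highlights)
        {
            Console.WriteLine(fragment); // e.g. "...<em>ANKARA</em>..."
        }
    }
}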