Elasticsearch Part 2 – NEST Client

Below is an application written in ASP.NET MVC using the NEST library and the ingest-attachment plugin. The application uploads a file to the filesystem and indexes it in Elasticsearch. A search query then finds the searched text in the indexed documents and gives access to the matching file for download.

NEST is a high-level Elasticsearch client for .NET. It can be installed from the Package Manager Console inside Visual Studio using

Install-Package NEST

or from the NuGet Package Manager in Visual Studio.

In the previous post we installed ingest-attachment for indexing documents in Elasticsearch. Now we will put it to use. First, how do we create a client in our code?

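var documentsIndex = "documents";
// No URI is passed, so the client targets the default http://localhost:9200.
// Documents of type Document are routed to the "documents" index.
var connectionSettings = new ConnectionSettings()
    .InferMappingFor<Document>(m => m.IndexName(documentsIndex));
var client = new ElasticClient(connectionSettings);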

Shards and replicas are important. To set their counts we need an IndexSettings object:

var indexSettings = new IndexSettings();
indexSettings.NumberOfReplicas = 1; 
indexSettings.NumberOfShards = 3;

We can take advantage of a NEST feature known as automapping to simplify the mapping further. Automapping infers the document mapping to send to Elasticsearch from the types of the properties on the POCO.

var indexResponse = client.CreateIndex(documentsIndex, c => c
    .Settings(s => s
        .Analysis(a => a
            // Analyzer that splits a Windows path into its ancestor paths
            .Analyzers(ad => ad
                .Custom("windows_path_hierarchy_analyzer", ca => ca
                    .Tokenizer("windows_path_hierarchy_tokenizer")
                )
            )
            .Tokenizers(t => t
                .PathHierarchy("windows_path_hierarchy_tokenizer", ph => ph
                    .Delimiter('\\')
                )
            )
        )
    )
    .Mappings(m => m
        .Map<Document>(mp => mp
            .AutoMap()
            .AllField(all => all
                .Enabled(false)
            )
            .Properties(ps => ps
                .Text(s => s
                    .Name(n => n.Path)
                    .Analyzer("windows_path_hierarchy_analyzer")
                )
                // Shorter alternative: let automapping infer the attachment fields
                //.Object<Attachment>(a => a
                //    .Name(n => n.Attachment)
                //    .AutoMap()
                //)
                .Object<Attachment>(a => a
                    .Name(n => n.Attachment)
                    .Properties(p => p
                        .Text(t => t
                            .Name(n => n.Name)
                        )
                        // Term vectors with positions and offsets are required
                        // by the fast vector highlighter used in the search below
                        .Text(t => t
                            .Name(n => n.Content)
                            .TermVector(TermVectorOption.WithPositionsOffsets)
                            .Store(true)
                        )
                        .Text(t => t
                            .Name(n => n.ContentType)
                        )
                        .Number(n => n
                            .Name(nn => nn.ContentLength)
                        )
                        .Date(d => d
                            .Name(n => n.Date)
                        )
                        .Text(t => t
                            .Name(n => n.Author)
                        )
                        .Text(t => t
                            .Name(n => n.Title)
                        )
                        .Text(t => t
                            .Name(n => n.Keywords)
                        )
                    )
                )
            )
        )
    )
);
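
To get a feel for what the path hierarchy tokenizer configured above emits for the Path field, here is a minimal standalone sketch in plain C#. It does not call Elasticsearch or NEST; it only imitates the tokenizer's behaviour for a drive-letter Windows path. PathHierarchyDemo, Tokenize and the sample path are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PathHierarchyDemo
{
    // Imitates a path_hierarchy tokenizer: emits one token per ancestor
    // prefix of the path. Assumes the path does not start with the delimiter
    // (true for drive-letter paths such as E:\...).
    public static List<string> Tokenize(string path, char delimiter = '\\')
    {
        var tokens = new List<string>();
        var parts = path.Split(delimiter);
        var current = "";
        for (var i = 0; i < parts.Length; i++)
        {
            // Grow the prefix by one path segment on every iteration
            current = i == 0 ? parts[i] : current + delimiter + parts[i];
            tokens.Add(current);
        }
        return tokens;
    }

    public static void Main()
    {
        // Hypothetical sample path, not a real file
        foreach (var token in Tokenize(@"E:\Development\Elasticsearch\20171110.pdf"))
        {
            Console.WriteLine(token);
        }
    }
}
```

Because every ancestor folder becomes its own token, a term query on Path for a folder such as E:\Development should match every document stored anywhere below that folder.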

You can inspect the resulting mapping with GetMapping<>():

var mappingResponse = client.GetMapping<Document>();

Now let's create a pipeline:

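client.PutPipeline("attachments", p => p
    .Description("Document attachment pipeline")
    .Processors(pr => pr
        // Extract text and metadata from the Base64 Content field into Attachment
        .Attachment<Document>(a => a
            .Field(f => f.Content)
            .TargetField(f => f.Attachment)
        )
        // Drop the raw Base64 payload once it has been processed
        .Remove<Document>(r => r
            .Field(f => f.Content)
        )
    )
);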

Next, create a document class for indexing:

    public class Document
    {
        public DateTime VerdictDate { get; set; }
        public int VerdictYear { get; set; }
        public int VerdictNo { get; set; }
        public string VerdictSubject { get; set; }
        public string Path { get; set; }
        public string Content { get; set; }
        public Attachment Attachment { get; set; }
    }

For indexing we will use client.Index:

// Requires: using System.IO;
var directory = Directory.GetCurrentDirectory();
// The attachment processor expects the file content as a Base64 string
var base64File = Convert.ToBase64String(File.ReadAllBytes(Path.Combine(directory, "20171110.pdf")));
client.Index(new Document
{
    VerdictNo = 23,
    VerdictDate = DateTime.Now,
    VerdictSubject = "Karar Konulu Konulu",
    VerdictYear = DateTime.Now.Year,
    Path = @"E:\Development\Elasticsearch\Elasticsearch\ES.Console\bin\Debug\20171110.pdf",
    Content = base64File
}, i => i.Pipeline("attachments")); // run the document through the pipeline created above
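
The attachment processor expects the Content field to hold the file as a Base64 string, which is why the file bytes are encoded before indexing. The sketch below, assuming nothing beyond the .NET base class library, shows the encoding round trip on a few in-memory bytes instead of a real PDF. Base64Demo and its methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class Base64Demo
{
    // Encode raw file bytes the same way the indexing code above does
    public static string Encode(byte[] fileBytes) => Convert.ToBase64String(fileBytes);

    // Decode a Base64 string back into the original bytes
    public static byte[] Decode(string base64) => Convert.FromBase64String(base64);

    public static void Main()
    {
        // Stand-in for File.ReadAllBytes(...): a few bytes held in memory
        var original = Encoding.UTF8.GetBytes("ANKARA");

        var encoded = Encode(original);
        var decoded = Decode(encoded);

        Console.WriteLine(encoded);
        // The round trip must give back exactly the bytes we started with
        Console.WriteLine(original.SequenceEqual(decoded));
    }
}
```

Base64 output is roughly a third larger than the input, so for big files it may be worth keeping an eye on request size when indexing this way.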

To create a search query with highlights I used the fast vector highlighter, as below:

var searchResponse = client.Search<Document>(s => s
    .Query(q => q
        .Match(m => m
            .Field(a => a.Attachment.Content)
            .Query("ANKARA")
        )
    )
    .Highlight(h => h
        .PreTags("")
        .PostTags("")
        .Encoder("html")
        .Fields(fs => fs
            .Field(p => p.Attachment.Content)
            // Fvh = fast vector highlighter; relies on the term vectors stored for Content
            .Type(HighlighterType.Fvh)
            .PreTags("")
            .PostTags("")
            .NumberOfFragments(3)
            .BoundaryMaxScan(50)
            .PhraseLimit(10)
            .HighlightQuery(q => q
                .Match(m => m
                    .Field(p => p.Attachment.Content)
                    .Query("ANKARA")
                )
            )
        )
    )
);

To read the highlights, look at the Hits included in the response:

var highlightsInEachHit = searchResponse.Hits.Select(s => s.Highlights);
