A job completion plugin for ElasticSearch
Alejandro Sanchez
[email protected]

Index
1. Introduction
2. ElasticSearch
3. MareNostrumIII solution
4. Plugin goals
5. Plugin development
6. Production integration
7. Future work
8. References and conclusions

Introduction and motivation
• BSC-CNS (Barcelona Supercomputing Center)
  o Officially constituted in April 2005
  o Variety of clusters:
    - MareNostrumIII: 48,896 cores (3,056 compute nodes), 103.5 TB of main memory, IBM Platform LSF
    - MinoTauro (GPU based), CNAG (genomics), BSCCV (life sciences), ... of different sizes/configurations, running SLURM
  o Mission: research, develop and manage IT in order to ease scientific progress
  o Special dedication to some areas: Computer Sciences, Life Sciences, Earth Sciences and Computational Applications

Introduction and motivation
• Who makes use of the clusters? How do we divide CPU hours among projects?
• Sharing depends on the cluster. MN3 cluster share: PRACE (Partnership for Advanced Computing in Europe) 70%, RES (Red Española de Supercomputación) 24%, BSC projects 6%
• MN3 queues: prace, bsc_ls, bsc_cs, bsc_es, class_a, class_b, class_c

Introduction and motivation
• We need to ensure that the CPU-usage distribution among projects meets the agreement
• Analyzing data about finished jobs gives us very valuable information:
  o Correlations between users' time_limit and elapsed time
  o Statistical information about projects, groups, users and how their executions finish
• Use the results to:
  o Make corrections to the scheduling configuration
  o Train users on how to properly submit jobs
  o Support accounting purposes
• There is a NEED to store historical data about finished jobs

ElasticSearch basics
“Elasticsearch is a flexible and powerful open source, distributed, real-time search and analytics engine.”
Features:
• Real-time data
• Distributed
• High availability
• Document oriented (JSON)
• RESTful API
• Schema free
• Based on Apache Lucene
www.elasticsearch.org

ElasticSearch basics
Structure:
• Cluster: “collection of one or more nodes (servers) that together holds your entire data”
• Node: “single server that is part of your cluster”
• Index: “collection of documents that have somewhat similar characteristics”
• Type: “within an index, you can define one or more types (logical category/partition)”
• Document: “basic unit of information that can be indexed, expressed in JSON format”
• Shard: “subdivision of an index”

MareNostrumIII solution
[Architecture diagram] On the scheduling server, running LSF, mbatchd writes the events_log and lsb.acct files. inotify detects new events, which are sent over TCP (netcat) to the monitoring server, where a logstash event pipe indexes them into ElasticSearch. Kibana uses ElasticSearch and presents the job historical data; the web browser accesses it through httpd requests.

Plugin goals
• The rest of the BSC clusters use SLURM
• Make the solution generic, following the SLURM guidelines
• The existing jobcomp plugins (mysql, filetxt, script) didn't satisfy our needs, so a new one was developed: jobcomp/elasticsearch
• slurmctld indexes finished job data into ElasticSearch, 37 fields per job:
  account, alloc_node, cluster, cpu_hours, cpus_per_task, derived_exitcode, elapsed, eligible_time, end_time, excluded_nodes, exitcode, gres_alloc, gres_req, group_id, groupname, jobid, nodes, ntasks, ntasks_per_node, orig_dependency, parent_accounts, partition, qos, reservation_name, script, start_time, state, std_err, std_in, std_out, submit_time, time_limit, total_cpus, total_nodes, user_id, username, work_dir

Plugin development
• Operations against the elasticsearch server are executed through HTTP requests/responses
• Request pattern:

  $ curl -X<VERB> '<PROTOCOL>://<HOST>/<PATH>?<QUERY_STRING>' -d '<BODY>'

• Request to index a document, for example:

  $ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
      "user" : "kimchy",
      "post_date" : "2009-11-15T14:12:12",
      "message" : "trying out Elasticsearch"
  }'

• The plugin uses the libcurl library (libcurl-devel) to handle requests/responses
  o autoconf files have been added: the plugin is not installed unless the library is detected and usable
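As a rough illustration of what this looks like from C, the following is a minimal sketch (not the plugin's actual code) of indexing one finished-job document with libcurl. The function name index_job_record(), the discard callback and the success check via curl_easy_getinfo() are assumptions made for this example:

  #include <curl/curl.h>

  /* Discard the response body; only the status code is inspected. */
  static size_t _discard(void *ptr, size_t size, size_t nmemb, void *userdata)
  {
      (void) ptr;
      (void) userdata;
      return size * nmemb;
  }

  /* PUT one JSON job record; return 0 on HTTP 200/201, -1 otherwise. */
  static int index_job_record(const char *url, const char *json)
  {
      struct curl_slist *hdrs = NULL;
      CURL *curl = curl_easy_init();
      long code = 0;
      int rc = -1;

      if (!curl)
          return rc;

      hdrs = curl_slist_append(hdrs, "Content-Type: application/json");
      curl_easy_setopt(curl, CURLOPT_URL, url);
      curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "PUT");
      curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
      curl_easy_setopt(curl, CURLOPT_POSTFIELDS, json);
      curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, _discard);

      if (curl_easy_perform(curl) == CURLE_OK) {
          /* CURLINFO_RESPONSE_CODE reports the last status line,
           * so an interim "100 Continue" does not mask the final
           * 200 (OK) or 201 (Created). */
          curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &code);
          if (code == 200 || code == 201)
              rc = 0;
      }

      curl_slist_free_all(hdrs);
      curl_easy_cleanup(curl);
      return rc;
  }

A call such as index_job_record("http://localhost:9200/slurm/jobcomp/1234", job_json) would index one document (the slurm/jobcomp index/type path here is purely hypothetical). Reading the status code back with curl_easy_getinfo() is one simple alternative to parsing the raw response headers, which is the approach the next slides describe.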
Plugin development
• The plugin can be enabled and configured in slurm.conf:

  JobCompType=jobcomp/elasticsearch
  JobCompLoc=http://YOURELASTICSERVER:9200

• The plugin therefore has to check that the server referenced by the configured URL is reachable and accessible
• How? By capturing and parsing the HTTP response headers received from the server side

Plugin development
• Example of the response for a properly indexed document:

  HTTP/1.1 201 Created
  Content-Type: application/json; charset=UTF-8
  Content-Length: 92

  {"_index":"someindex","_type":"sometype","_id":"fsAx6qXcQGCSrY1DWvQACw","_version":1,"created":true}

• Only the header is needed (not the body), so the libcurl parameters are configured to capture just the headers
• Specifically, the plugin checks whether the status code is 200 (OK) or 201 (Created)

Plugin development
• Different sources of failure: server unavailable, index read-only, etc.
• Example of a document not indexed:

  HTTP/1.1 403 Forbidden
  Content-Type: application/json; charset=UTF-8
  Content-Length: 96

  {"error":"ClusterBlockException[blocked by: [FORBIDDEN/5/index read-only (api)];]","status":403}

• Does that mean that every status code other than 200 or 201 indicates a failure? NO, a corner case was found while testing:

  HTTP/1.1 100 Continue

  HTTP/1.1 200 OK
  Date: Fri, 31 Dec 1999 23:59:59 GMT
  Content-Type: application/json

  o 100 Continue is used to determine whether the origin server is willing to accept the request (based on the headers) before the client sends the body

Plugin development
• What happens with job data that can't be indexed? The plugin manages a memory structure to keep track of the pending jobs (job0 data, job1 data, ..., jobN-1 data):

  typedef struct {
      uint32_t nelems;
      char **jobs;
  } pending_jobs_t;

• Data coherence is kept between the memory structure and a state file, StateSaveLocation/elasticsearch_state
• Data is saved in network byte order, using the SLURM functions pack_str_array() and safe_unpackstr_array()

Plugin development
• When does the plugin try to reindex the pending jobs? (see the sketches below)
  1. When the plugin is loaded: _load_pending_jobs() reads elasticsearch_state and calls _index_retry()
  2. Just after a successfully indexed job: _index_retry() is called again
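A sketch of the state-file round trip described above, assuming the pack API from SLURM's in-tree src/common/pack.h as it existed around the plugin's development (Buf, init_buf(), BUF_SIZE); the helper names _pack_pending_jobs() and _unpack_pending_jobs() are illustrative, and the real plugin also handles the file I/O and locking:

  #include <slurm/slurm_errno.h>
  #include "src/common/pack.h"

  typedef struct {
      uint32_t nelems;  /* number of pending job records */
      char **jobs;      /* one serialized JSON document each */
  } pending_jobs_t;

  /* Serialize the pending jobs in network byte order; the caller
   * writes get_buf_data(buffer) to
   * StateSaveLocation/elasticsearch_state. */
  static Buf _pack_pending_jobs(pending_jobs_t *pend)
  {
      Buf buffer = init_buf(BUF_SIZE);
      pack_str_array(pend->jobs, pend->nelems, buffer);
      return buffer;
  }

  /* Rebuild the in-memory structure from a buffer read back from
   * disk; safe_unpackstr_array() jumps to unpack_error on
   * malformed input. */
  static int _unpack_pending_jobs(pending_jobs_t *pend, Buf buffer)
  {
      safe_unpackstr_array(&pend->jobs, &pend->nelems, buffer);
      return SLURM_SUCCESS;

  unpack_error:
      return SLURM_ERROR;
  }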
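And an illustrative version of the retry pass, reusing index_job_record() and pending_jobs_t from the sketches above; the function name _index_retry() comes from the slides, but this body is a sketch and the actual implementation in the plugin sources may differ:

  #include <stdlib.h>

  /* Attempt to index every pending job; keep only the ones that
   * still fail, so they can be retried on the next opportunity. */
  static void _index_retry(pending_jobs_t *pend, const char *url)
  {
      uint32_t i, kept = 0;

      for (i = 0; i < pend->nelems; i++) {
          if (index_job_record(url, pend->jobs[i]) == 0)
              free(pend->jobs[i]);              /* indexed, drop it */
          else
              pend->jobs[kept++] = pend->jobs[i];   /* still pending */
      }
      pend->nelems = kept;
  }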
Production integration
• A web layer has been added (Kibana):
  o Configurable dashboards, time-based comparisons
  o Make sense of your data: create bar, line and scatter plots
  o Flexible interface, easy to share
  o Powerful search syntax and easy setup

Production integration
• The plugin is already running in the MinoTauro cluster
  o 126 compute nodes, GPU based
  o 2 login nodes
• Integration planned in the CNAG cluster in the coming months
  o Genomics analysis and research
  o 100 compute nodes, 20 HiMem nodes
  o 2 login nodes
• Same with the rest of the BSC SLURM clusters: BSCCV, Altix2 UV100, etc.

Production integration
• Kibana global view [screenshot]

Production integration
• Zoom in/out of the time range [screenshot]
• Expand job data details
• Search, filter, pagination, ...

Future work (basic statistics)
• Elapsed time vs project/QOS
• Mins, maxs, means, std-devs, ...

Future work (Machine Learning)
• Simple prediction methods (linear regression)
• time_limit prediction based on submit parameters:

  Yt = β0 + β1·X1 + β2·X2 + … + βp·Xp + ε

  o Yt: measured or dependent variable
  o Xi: input or independent variables
  o βi: regression coefficients
  o ε: error term
• Helps improve backfill scheduling (more efficient usage of cluster resources)
• A submit plugin could be developed applying the prediction formula
• There are more complex models, using decision trees or combining different models into one

References and conclusions
• SLURM reference to the plugin: http://slurm.schedmd.com/download.html
• GitHub repository: https://github.com/asanchez1987/jobcomp-elasticsearch
• Possible merge in future stable releases
• Final Master Thesis, in a university-company context:
  o Barcelona School of Informatics, www.fib.upc.edu/en
  o Barcelona Supercomputing Center, www.bsc.es