BackEnd21

31 Dec 2020
by ignat


 

https://drive.google.com/file/d/19hfwx_TQqHoNURyEpq4Si7Vpv2VmJ7sX/view?usp=sharing

 Composer (eng)

  1. Install the correct Docker version following the documentation: https://docs.docker.com/machine/install-machine/

 

  1. Install docker-compose: sudo apt install docker-compose

 

  1. Install VirtualBox: https://tecadmin.net/install-virtualbox-on-ubuntu-18-04/

 

  1. Install Confluent Platform following its Docker quickstart: https://docs.confluent.io/current/quickstart/ce-docker-quickstart.html

 

  1. Install PostgreSQL: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-18-04-ru

 

In PostgreSQL, create the insikt database and restore the tables from the dump.

 

Keystore db

The keystore database also needs to be restored.

Restore it locally from a dump with:

 

pg_restore -h {ip} -U postgres -W insikt1.sql -d insikt

 

Note: the database must already exist.
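
The two steps above (create the database, then restore the dump) can be sketched as a small Python helper that only builds the command lines. This is a minimal sketch, not part of the original setup: the function name restore_commands is hypothetical, and the createdb call is an assumption about how the database gets created. The pg_restore options are reordered so that the dump file comes last.

```python
import shlex


def restore_commands(host, dump_file, database="insikt", user="postgres"):
    """Return the argv lists for creating the database and restoring the dump."""
    # createdb is the standard PostgreSQL client tool for creating a database.
    create = ["createdb", "-h", host, "-U", user, database]
    # Same options as the pg_restore command in the text, dump file last.
    restore = ["pg_restore", "-h", host, "-U", user, "-W", "-d", database, dump_file]
    return [create, restore]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in restore_commands("127.0.0.1", "insikt1.sql"):
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run on a machine that has no PostgreSQL client installed.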

 

  1. Build the container images by running srun.sh in each of them. In the backend directory, run make deploy

 

  1. After the instance has been started:

 

systemctl enable systemd-resolved

systemctl start systemd-resolved

 

  1. Check Elasticsearch

 

https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html

 

sudo docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:5.6.15
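
Once the container is started, it can take a while before Elasticsearch answers on port 9200. The following is a minimal sketch of a readiness check using only the Python standard library; the helper names es_url and wait_for_es, the retry count and the delay are illustrative assumptions, not part of the original setup.

```python
import json
import time
import urllib.request


def es_url(host, port=9200):
    """Build the base URL of the Elasticsearch HTTP endpoint."""
    return f"http://{host}:{port}/"


def wait_for_es(url, attempts=30, delay=2.0):
    """Poll the root endpoint; return the reported version string, or None."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                info = json.load(resp)
                # The root endpoint reports the server version under version.number.
                return info.get("version", {}).get("number")
        except OSError:
            # Connection refused or timed out: the node is not up yet.
            time.sleep(delay)
    return None


if __name__ == "__main__":
    print(wait_for_es(es_url("localhost")))
```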

 

  1. Adding an index to Elasticsearch

 

After this, open Kibana in a browser at IP_SERVER:5601.

Once Kibana has loaded, go to the sidebar menu and click Dev Tools.

You will see the Kibana console.

 

Open Kibana (/app/kibana#/dev_tools/console?_g=())

Dev Tools – Console

Paste the script code:

 

PUT demo

{

 

“settings”: {

“number_of_shards”: 6,

“number_of_replicas”: 1,

“analysis”: {

“analyzer”: {

“default”: {

“type”: “standard”,

“tokenizer”: “lowercase”,

“filter”: [

“asciifolding”

]

}

}

},

“index.requests.cache.enable”: true

},

“mappings”: {

“tweet”: {

“_source”: {

“enabled”: true

},

“properties”: {

“analysis”: {

“properties”: {

“threatScore”: {

“type”: “long”,

“doc_values”: true

},

“concepts”: {

“properties”: {

“concept”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“docSentiment”: {

“type”: “double”,

“index”: true,

“doc_values”: true

},

“emotions”: {

“properties”: {

“emotion”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“entities”: {

“type”: “nested”,

“properties”: {

“entity”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“entityType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“type”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“hashtags”: {

“properties”: {

“text”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“keyIdeas”: {

“properties”: {

“keyIdea”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“topics”: {

“type”: “nested”,

“properties”: {

“topic”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“category”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

}

}

},

“createdAt”: {

“type”: “date”,

“index”: true,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“detectedLang”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“geoLocation”: {

“properties”: {

“latitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

}

}

},

“coordinates”: {

“index”: true,

“type”: “geo_point”

},

“geoname”: {

“properties”: {

“countryCode”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“geonameid”: {

“type”: “integer”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“hashtagEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“idLong”: {

“type”: “long”,

“doc_values”: true

},

“mediaEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“mediaURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“mediaURLHttps”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

}

}

},

“place”: {

“properties”: {

“boundingBoxCoordinates”: {

“properties”: {

“latitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

}

}

},

“boundingBoxType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“country”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“countryCode”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“fullName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“placeType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“retweetedStatus”: {

“properties”: {

“createdAt”: {

“type”: “date”,

“index”: true,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“geoLocation”: {

“properties”: {

“latitude”: {

“type”: “double”,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“doc_values”: true

}

}

},

“hashtagEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“mediaEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“mediaURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“mediaURLHttps”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

}

}

},

“place”: {

“properties”: {

“boundingBoxCoordinates”: {

“properties”: {

“latitude”: {

“type”: “double”,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“doc_values”: true

}

}

},

“boundingBoxType”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“country”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“countryCode”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“fullName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“placeType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“source”: {

“type”: “keyword”,

“index”: true

},

“symbolEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”,

“index”: true

}

}

},

“text”: {

“type”: “keyword”,

“index”: true

},

“urlEntities”: {

“properties”: {

“displayURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“end”: {

“type”: “long”,

“doc_values”: true

},

“expandedURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“user”: {

“properties”: {

“createdAt”: {

“type”: “date”,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“description”: {

“type”: “keyword”,

“index”: true

},

“favouritesCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“followersCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“friendsCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“lang”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“location”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“fields”: {

“raw”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“profileImageUrl”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“statusesCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“userMentionEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“id”: {

“type”: “long”,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

}

}

}

}

},

“savedAt”: {

“type”: “date”,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“source”: {

“type”: “keyword”,

“index”: true

},

“symbolEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”

}

}

},

“text”: {

“type”: “keyword”,

“index”: true

},

“unifiedText”: {

“type”: “text”,

“index”: true

},

“unifiedUrls”: {

“type”: “keyword”

},

“urlEntities”: {

“properties”: {

“displayURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“end”: {

“type”: “long”,

“doc_values”: true

},

“expandedURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“user”: {

“properties”: {

“createdAt”: {

“type”: “date”,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“description”: {

“type”: “keyword”,

“index”: true

},

“favouritesCount”: {

“type”: “long”,

“doc_values”: true

},

“followersCount”: {

“type”: “long”,

“doc_values”: true

},

“friendsCount”: {

“type”: “long”,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“lang”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“location”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“fields”: {

“raw”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“profileImageUrl”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“statusesCount”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“urlEntity”: {

“properties”: {

“displayURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“end”: {

“type”: “long”,

“doc_values”: true

},

“expandedURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

}

}

},

“userMentionEntities”: {

“properties”: {

“end”: {

“type”: “long”

},

“id”: {

“type”: “long”

},

“name”: {

“type”: “keyword”

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“start”: {

“type”: “long”

}

}

}

}

}

}

}
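
The mapping above repeats the same keyword definition many times. As an illustration only, a fragment of it could be generated in Python with two small helpers. The helper names keyword and obj are hypothetical; the field names are taken from the place object in the mapping.

```python
import json


def keyword(index=True):
    """A keyword field as used throughout the mapping above."""
    return {"type": "keyword", "index": index, "doc_values": True}


def obj(**props):
    """An object field that wraps its sub-fields in "properties"."""
    return {"properties": props}


# A fragment of the "place" object from the mapping, rebuilt with the helpers.
place = obj(
    country=keyword(),
    countryCode=keyword(),
    url=keyword(index=False),
)

if __name__ == "__main__":
    print(json.dumps(place, indent=2))
```

Generating the body this way also guarantees plain ASCII double quotes, which the Kibana console requires.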

 

sudo apt install python-pytest python-elasticsearch

 

Inside the script, I switched the ES_CONN variable to the IP address 172.18.0.2

 

python test_smoke.py

 

Check

 

GET demo/_search

{

"query": {

"match_all": {}

}

}
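
The same check can be scripted. The sketch below builds the match_all request body and reads the hit count from a search response that has already been parsed into a dictionary. The function names are illustrative. The hit count is handled in both forms: a plain number, as Elasticsearch 5.x returns it, and an object with a value field, as newer versions return it.

```python
import json


def match_all_body():
    """JSON body for the GET demo/_search request shown above."""
    return json.dumps({"query": {"match_all": {}}})


def total_hits(response):
    """Extract the hit count from a parsed search response."""
    total = response["hits"]["total"]
    # Elasticsearch 5.x returns a number; 7.x and later return {"value": ...}.
    return total["value"] if isinstance(total, dict) else total


if __name__ == "__main__":
    print(match_all_body())
    print(total_hits({"hits": {"total": 0, "hits": []}}))
```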

 

  1. MySQL

 

The tmp/pyalerts directory contains the file below. If the file is being used for the first time, comment out the index creation statements in it.

 

mysql-tested.sql

 

Run the command below; when prompted for the password, enter: test

 

mysql -h 172.18.0.4 -u test -p insikt < mysql-tested.sql

 

Then go to the MySQL console and create 2 more tables:

 

mysql -h 172.18.0.4 -u test -p insikt

 

CREATE TABLE language (id INT auto_increment PRIMARY KEY, name text, status tinyint(1));

CREATE TABLE network_analysis (id INT auto_increment PRIMARY KEY, project_id VARCHAR(256) DEFAULT NULL, start date, end date, source varchar(1000), status tinyint(4));

 

  1. Check the result

 

sudo docker logs deploy_backend_1

 

sudo docker logs deploy_frontend_1

 

sudo docker ps

 

Lists the running containers.

 

ps axf

Shows the containers in the process tree.

 

docker restart – restarts a running container

docker update – updates a container's parameters

 

sudo docker exec -it {container_name} bash

Opens a shell inside the container.

 

DB_PASS_POSTGRESQL = “Pbdivbknn123”

Setting up PostgreSQL on localhost

How to allow remote connections to a PostgreSQL database server: https://bosnadev.com/2015/12/15/allow-remote-connections-postgresql-database-server/

Set listen_addresses = '*' in /etc/postgresql/10/main/postgresql.conf

 

Also list the IP addresses that are allowed to connect to the database in /etc/postgresql/10/main/pg_hba.conf
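
An entry in pg_hba.conf has the form: connection type, database, user, address, authentication method. As a rough sketch, such a line can be generated and its address validated in Python. The function name hba_line and the defaults (all databases, all users, md5) are assumptions for illustration; the 172.18.0.0/16 network is only a guess based on the container addresses used elsewhere in these notes.

```python
import ipaddress


def hba_line(cidr, database="all", user="all", method="md5"):
    """Format one host entry for pg_hba.conf, validating the network first."""
    # ip_network raises ValueError if the CIDR is malformed or has host bits set.
    network = ipaddress.ip_network(cidr)
    return f"host    {database}    {user}    {network}    {method}"


if __name__ == "__main__":
    print(hba_line("172.18.0.0/16"))
```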

 

In the docker-compose.yml file, to let containers reach Kafka, set this in the environment variables of the broker service:

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://broker:9092

 

For access to Kafka from outside the containers:

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
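
The two values differ only in the host advertised for the PLAINTEXT_HOST listener. A small sketch that parses the variable makes the difference explicit; the function name parse_listeners is illustrative.

```python
def parse_listeners(value):
    """Split a KAFKA_ADVERTISED_LISTENERS value into {name: (host, port)}."""
    listeners = {}
    for item in value.split(","):
        # Each item looks like NAME://host:port
        name, _, address = item.strip().partition("://")
        host, _, port = address.rpartition(":")
        listeners[name] = (host, int(port))
    return listeners


if __name__ == "__main__":
    inside = parse_listeners("PLAINTEXT://broker:29092,PLAINTEXT_HOST://broker:9092")
    outside = parse_listeners("PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092")
    # Only the PLAINTEXT_HOST entry changes between the two setups.
    print(inside["PLAINTEXT_HOST"], outside["PLAINTEXT_HOST"])
```

Clients connect to whatever address the broker advertises, so a client running on the host machine needs localhost:9092, while clients inside the Docker network need the broker hostname.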

 

System restart

cd deploy

sudo docker-compose down – stops and removes all containers

sudo docker stop {container name} – stops a container

sudo docker rm {container name} – removes a container

sudo docker-compose up -d – starts all containers from the docker-compose.yml file

 

  1. Storm

https://streamparse.readthedocs.io/en/stable/quickstart.html

 

The script for installing Storm automatically from a local Linux system over ssh is at ~/tmp/supervisord/storm/storm.py

 

You also need the archive with the Storm 1.0.6 distribution, c.tar.gz,

and the archive of the daemon used to run Storm, called supervisord: s.tar.gz.

Oracle Java, jdk-8u192-linux-x64.tar.gz, is in the same place.

 

Both archives are saved in ~/tmp/supervisord/

 

To install Java, use the command:

 

apt install default-jdk

 

Then, for completeness, put Oracle Java in the /usr/lib/jvm directory next to OpenJDK; sometimes the original Java is a better fit.

 

To use Oracle Java, it is enough to set the environment variable in the relevant configs:

 

JAVA_HOME=/usr/lib/jvm/jdk1.8.0_192

 

Add PATH="/opt/storm/current/bin:$PATH" to ~/.profile

 

To start Storm, run:

 

sudo /etc/init.d/supervisor start

 

To stop Storm, use the same command with stop in place of start.

 

storm version

 

http://ip:8080/index.html

 

  1. Leiningen

 

https://github.com/technomancy/leiningen#leiningen

 

Creating an additional bin directory in the home directory (~) is optional.

cd ~

mkdir bin

 

cd ~/bin

or

cd ~/.local/bin

 

wget https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein

 

chmod +x ~/bin/lein

or

chmod +x ~/.local/bin/lein

 

Then:

 

lein version

 

  1. Streamparse

 

sudo pip3 install streamparse

 

sparse quickstart wordcount

 

cd wordcount

 

In the project.clj file, change the Storm version on line 6:

 

:dependencies [[org.apache.storm/storm-core "1.0.6"]
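
The edit can also be scripted rather than done by hand. The following is a minimal sketch, assuming the dependency appears in project.clj in the form shown above; the function name set_storm_version is hypothetical.

```python
import re

# Matches the storm-core dependency and captures the text around the version.
STORM_DEP = re.compile(r'(\[org\.apache\.storm/storm-core\s+")[^"]+(")')


def set_storm_version(project_clj, version):
    """Return the project.clj text with the storm-core version replaced."""
    return STORM_DEP.sub(lambda m: m.group(1) + version + m.group(2), project_clj)


if __name__ == "__main__":
    sample = '  :dependencies [[org.apache.storm/storm-core "1.1.0"]\n'
    print(set_storm_version(sample, "1.0.6"), end="")
```

A replacement function is used instead of a replacement string so that a version starting with a digit is not misread as part of a group reference.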

 

Check that it works:

 

sparse run

 

To run the topology on a Storm cluster rather than with the local library, use the command below (the IP addresses must be configured in the config.json file):

 

sparse submit
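
As a sketch of the config.json change, the helper below sets the cluster addresses in a configuration that has been loaded into a dictionary. The key names envs, prod, nimbus and workers are an assumption based on the file that sparse quickstart generates, so check them against your own config.json. The host names in the example are placeholders.

```python
import copy
import json


def set_cluster(config, nimbus, workers, env="prod"):
    """Return a copy of the config with the Nimbus and worker hosts set."""
    updated = copy.deepcopy(config)
    # Assumed layout: {"envs": {"prod": {"nimbus": ..., "workers": [...]}}}
    target = updated.setdefault("envs", {}).setdefault(env, {})
    target["nimbus"] = nimbus
    target["workers"] = list(workers)
    return updated


if __name__ == "__main__":
    original = {"envs": {"prod": {"user": "", "nimbus": "", "workers": []}}}
    # Placeholder host names, not real servers.
    print(json.dumps(set_cluster(original, "nimbus.example", ["worker1.example"]), indent=2))
```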

 

  1. TensorFlow

 

For a first introduction to what neural networks can do, see:

https://www.tensorflow.org/tutorials/keras/classification?hl=en

 

Copy the NLP_Engine_v2.tar.gz archive to any location on disk with enough free space.

 

Extract it in the same directory. This matters because the model is large:

 

tar -zxvf NLP_Engine_v2.tar.gz -C .

 

The trailing dot in the command means extract into the current directory.
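
For reference, the same extraction can be done from Python with the standard tarfile module. This is a minimal sketch that unpacks the archive next to where it is stored, which is the intent of the step above; the function name extract_here is illustrative. Only use it on archives from a trusted source.

```python
import tarfile
from pathlib import Path


def extract_here(archive_path):
    """Extract a .tar.gz archive into the directory that contains it."""
    archive = Path(archive_path)
    dest = archive.parent
    with tarfile.open(archive, "r:gz") as tar:
        # Unpack every member into the archive's own directory.
        tar.extractall(dest)
    return dest


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(extract_here(sys.argv[1]))
```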

 

Then you need to install the necessary Python packages:

 

sudo pip3 install nltk numpy regex stanfordnlp joblib lmdb vaderSentiment polyglot pycld2 morfessor keras tensorflow sklearn elasticsearch pandas

 

sudo apt install python3-mysql.connector python3-pycurl

 

Then go to the NLP_Engine_v2 directory and execute the command:

 

source env/bin/activate

 

Note the nlp directory. It is the Python module that performs the analysis.

 

After that, set the paths to the local Python module nlp in:

 

nano nlp_engine/src/bolts/tweet_analysis.py

 

In another file, the environment variables need to be corrected. If you run through a container, make this change in the docker-compose.yml file instead.

 

nano nlp_engine/src/spouts/tweets.py

 

Then go to the nlp_engine directory and run the command:

 

sparse run

 

To run on the Storm cluster, you need to adjust the IP addresses in the config.json file and use the command:

 

sparse submit

 

  1. A container to run NLP

 

To run machine analysis, go to the directory … and run srun.sh

 

  1. Clause. Additions and thoughts out loud.

 

The command below can be used to download a whole site. The resulting directories can be used to test the parsing containers that extract information.

 

In principle, the standard features of Storm can be used to build a site map or site index.

 

If this content is filtered successfully, it can be used to train the neural network.

 

wget -m -l 10 -e robots=off -p -k -E --reject-regex "wp" --no-check-certificate -U "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36" forum.katera.ru

 

A trained neural network can be used to generate messages that keep a dialogue going. For any question, it can immediately give a link to the source on this forum.

 

Composer (ru)

  1. Install the correct Docker version following the documentation: https://docs.docker.com/machine/install-machine/

 

  1. Install docker-compose: sudo apt install docker-compose

 

  1. Install VirtualBox: https://tecadmin.net/install-virtualbox-on-ubuntu-18-04/

 

  1. Install Confluent Platform following its Docker quickstart: https://docs.confluent.io/current/quickstart/ce-docker-quickstart.html

 

  1. Install PostgreSQL: https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-18-04-ru

 

In PostgreSQL, create the insikt database and restore the tables from the dump.

 

Keystore db

The keystore database also needs to be restored.

Restore it locally from a dump with:

 

pg_restore -h 75.126.254.59 -U postgres -W insikt1.sql -d insikt

 

Note: the database must already exist.

 

  1. Build the container images by running srun.sh in each of them. In the backend directory, run make deploy

 

  1. After the instance has been started:

 

systemctl enable systemd-resolved

systemctl start systemd-resolved

 

  1. Check Elasticsearch

 

https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html

 

sudo docker run -p 9200:9200 -p 9300:9300 -e “discovery.type=single-node” docker.elastic.co/elasticsearch/elasticsearch:5.6.15

 

  1. Adding an index to Elasticsearch

 

After this, open Kibana in a browser at IP_SERVER:5601.

Once Kibana has loaded, go to the sidebar menu and click Dev Tools.

You will see the Kibana console.

 

Open Kibana (/app/kibana#/dev_tools/console?_g=())

Dev Tools – Console

Paste the script code:

 

PUT demo

{

 

“settings”: {

“number_of_shards”: 6,

“number_of_replicas”: 1,

“analysis”: {

“analyzer”: {

“default”: {

“type”: “standard”,

“tokenizer”: “lowercase”,

“filter”: [

“asciifolding”

]

}

}

},

“index.requests.cache.enable”: true

},

“mappings”: {

“tweet”: {

“_source”: {

“enabled”: true

},

“properties”: {

“analysis”: {

“properties”: {

“threatScore”: {

“type”: “long”,

“doc_values”: true

},

“concepts”: {

“properties”: {

“concept”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“docSentiment”: {

“type”: “double”,

“index”: true,

“doc_values”: true

},

“emotions”: {

“properties”: {

“emotion”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“entities”: {

“type”: “nested”,

“properties”: {

“entity”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“entityType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“type”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“hashtags”: {

“properties”: {

“text”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“keyIdeas”: {

“properties”: {

“keyIdea”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“topics”: {

“type”: “nested”,

“properties”: {

“topic”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“category”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

}

}

},

“createdAt”: {

“type”: “date”,

“index”: true,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“detectedLang”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“geoLocation”: {

“properties”: {

“latitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

}

}

},

“coordinates”: {

“index”: true,

“type”: “geo_point”

},

“geoname”: {

“properties”: {

“countryCode”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“geonameid”: {

“type”: “integer”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“hashtagEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“idLong”: {

“type”: “long”,

“doc_values”: true

},

“mediaEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“mediaURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“mediaURLHttps”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

}

}

},

“place”: {

“properties”: {

“boundingBoxCoordinates”: {

“properties”: {

“latitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“index”: true,

“doc_values”: true

}

}

},

“boundingBoxType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“country”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“countryCode”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“fullName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“placeType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“retweetedStatus”: {

“properties”: {

“createdAt”: {

“type”: “date”,

“index”: true,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“geoLocation”: {

“properties”: {

“latitude”: {

“type”: “double”,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“doc_values”: true

}

}

},

“hashtagEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“mediaEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“mediaURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“mediaURLHttps”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

}

}

},

“place”: {

“properties”: {

“boundingBoxCoordinates”: {

“properties”: {

“latitude”: {

“type”: “double”,

“doc_values”: true

},

“longitude”: {

“type”: “double”,

“doc_values”: true

}

}

},

“boundingBoxType”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“country”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“countryCode”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“fullName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“placeType”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“source”: {

“type”: “keyword”,

“index”: true

},

“symbolEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”,

“index”: true

}

}

},

“text”: {

“type”: “keyword”,

“index”: true

},

“urlEntities”: {

“properties”: {

“displayURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“end”: {

“type”: “long”,

“doc_values”: true

},

“expandedURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“user”: {

“properties”: {

“createdAt”: {

“type”: “date”,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“description”: {

“type”: “keyword”,

“index”: true

},

“favouritesCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“followersCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“friendsCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“lang”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“location”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“fields”: {

“raw”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

}

}

},

“profileImageUrl”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“statusesCount”: {

“type”: “long”,

“index”: true,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“userMentionEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“id”: {

“type”: “long”,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

}

}

}

}

},

“savedAt”: {

“type”: “date”,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“source”: {

“type”: “keyword”,

“index”: true

},

“symbolEntities”: {

“properties”: {

“end”: {

“type”: “long”,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“text”: {

“type”: “keyword”

}

}

},

“text”: {

“type”: “keyword”,

“index”: true

},

“unifiedText”: {

“type”: “text”,

“index”: true

},

“unifiedUrls”: {

“type”: “keyword”

},

“urlEntities”: {

“properties”: {

“displayURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“end”: {

“type”: “long”,

“doc_values”: true

},

“expandedURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“user”: {

“properties”: {

“createdAt”: {

“type”: “date”,

“doc_values”: true,

“format”: “dateOptionalTime”

},

“description”: {

“type”: “keyword”,

“index”: true

},

“favouritesCount”: {

“type”: “long”,

“doc_values”: true

},

“followersCount”: {

“type”: “long”,

“doc_values”: true

},

“friendsCount”: {

“type”: “long”,

“doc_values”: true

},

“id”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“lang”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“location”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“name”: {

“type”: “keyword”,

“index”: true,

“fields”: {

“raw”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

},

“profileImageUrl”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“statusesCount”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“urlEntity”: {

“properties”: {

“displayURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“end”: {

“type”: “long”,

“doc_values”: true

},

“expandedURL”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

},

“start”: {

“type”: “long”,

“doc_values”: true

},

“url”: {

“type”: “keyword”,

“index”: false,

“doc_values”: true

}

}

}

}

},

“userMentionEntities”: {

“properties”: {

“end”: {

“type”: “long”

},

“id”: {

“type”: “long”

},

“name”: {

“type”: “keyword”

},

“screenName”: {

“type”: “keyword”,

“index”: true,

“doc_values”: true

},

“start”: {

“type”: “long”

}

}

}

}

}

}

}

 

sudo apt install python-pytest python-elasticsearch

 

Inside the script, I switched the ES_CONN variable to the IP address 172.18.0.2

 

python test_smoke.py

 

Check

 

GET demo/_search

{

“query”: {

“match_all”: {}

}

}

 

  1. MySQL

 

The tmp/pyalerts directory contains the file below. If the file is being used for the first time, comment out the index creation statements in it.

 

mysql-tested.sql

 

Run the command below; when prompted for the password, enter: test

 

mysql -h 172.18.0.4 -u test -p insikt < mysql-tested.sql

 

Then open the MySQL console and create 2 more tables:

 

mysql -h 172.18.0.4 -u test -p insikt

 

CREATE TABLE language (id INT auto_increment PRIMARY KEY, name text, status tinyint(1));

CREATE TABLE network_analysis (id INT auto_increment PRIMARY KEY,project_id VARCHAR(256) DEFAULT NULL, start date, end date, source varchar(1000), status tinyint(4));

 

  1. Check the result

 

sudo docker logs deploy_backend_1

 

sudo docker logs deploy_frontend_1

 

sudo docker ps

Lists the running containers.

 

ps axf

Shows the containers in the process tree.

 

docker restart – restarts a running container

docker update – updates a container's parameters

 

sudo docker exec -it {container_name} bash

Opens a shell inside the container.

 

DB_PASS_POSTGRESQL = “Pbdivbknn123”

Setting up PostgreSQL on localhost

https://bosnadev.com/2015/12/15/allow-remote-connections-postgresql-database-server/

Set listen_addresses = '*' in /etc/postgresql/10/main/postgresql.conf

 

Also list the IP addresses that are allowed to connect to the database in /etc/postgresql/10/main/pg_hba.conf

 

In the docker-compose.yml file, to let containers reach Kafka, set this in the environment variables of the broker service:

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://broker:9092

 

For access to Kafka from outside the containers:

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092

 

System restart

cd deploy

sudo docker-compose down – stops and removes all containers

sudo docker stop {container name} – stops a container

sudo docker rm {container name} – removes a container

sudo docker-compose up -d – starts all containers from the docker-compose.yml file

 

  1. Storm

https://streamparse.readthedocs.io/en/stable/quickstart.html

 

Скрипт для автоматической установки Storm с локальной Линукс системы через ssh находиться в каталоге ~/tmp/supervisord/storm/storm.py

 

You also need the archive with the Storm 1.0.6 distribution – c.tar.gz

and the archive with supervisord, the daemon used to run Storm – s.tar.gz.

Oracle Java is stored in the same place – jdk-8u192-linux-x64.tar.gz

 

Both archives are stored in ~/tmp/supervisord/

 

To install Java, run:

 

apt install default-jdk

 

Then, for completeness, put Oracle Java in /usr/lib/jvm next to OpenJDK; sometimes the original Java is a better fit.

 

To use Oracle Java, it is enough to set this environment variable in the relevant configs:

 

JAVA_HOME=/usr/lib/jvm/jdk1.8.0_192

 

Add PATH="/opt/storm/current/bin:$PATH" to the ~/.profile file

 

To start Storm, run:

 

sudo /etc/init.d/supervisor start

 

To stop Storm, use the same command with stop instead of start.

 

storm version

 

http://75.126.254.59:8080/index.html

 

  1. Leiningen

 

https://github.com/technomancy/leiningen#leiningen

 

Creating an extra bin directory in the home directory ~ is optional.

cd ~

mkdir bin

 

cd ~/bin

or

cd ~/.local/bin

 

wget https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein

 

chmod +x ~/bin/lein

or

chmod +x ~/.local/bin/lein

 

then

 

lein version

 

  1. Streamparse

 

sudo pip3 install streamparse

 

sparse quickstart wordcount

 

cd wordcount

 

In project.clj, change the Storm version on line 6:

 

:dependencies  [[org.apache.storm/storm-core "1.0.6"]
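If the project is regenerated often, the version change can be scripted instead of edited by hand. This is a minimal sketch; the function name is hypothetical, and it assumes the dependency is written as org.apache.storm/storm-core followed by a quoted version, as in the line above.

```python
import re


def set_storm_version(project_clj_text, version="1.0.6"):
    """Return project.clj text with the storm-core version replaced."""
    pattern = r'(org\.apache\.storm/storm-core\s+")[^"]+(")'
    # A function replacement avoids ambiguity between group numbers
    # and the digits of the version string.
    return re.sub(pattern,
                  lambda m: m.group(1) + version + m.group(2),
                  project_clj_text)


if __name__ == "__main__":
    with open("project.clj", encoding="utf-8") as fh:
        text = fh.read()
    with open("project.clj", "w", encoding="utf-8") as fh:
        fh.write(set_storm_version(text))
```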

 

Check that it works:

 

sparse run

 

To run the topology on a Storm cluster rather than in local mode, use this command (the IPs must be configured in config.json):

sparse submit
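The cluster addresses can be written into config.json with a few lines of Python. This is a sketch under assumptions: the key names envs, prod, nimbus and workers follow the config.json that the streamparse quickstart generates as far as I recall, so compare them with your own file first. The IP addresses are placeholders.

```python
import json


def set_cluster_ips(config_text, nimbus, workers, env="prod"):
    """Return config.json text with the Nimbus and worker hosts filled in."""
    config = json.loads(config_text)
    # Assumed layout: {"envs": {"prod": {"nimbus": ..., "workers": [...]}}}
    env_cfg = config.setdefault("envs", {}).setdefault(env, {})
    env_cfg["nimbus"] = nimbus
    env_cfg["workers"] = list(workers)
    return json.dumps(config, indent=4)


if __name__ == "__main__":
    with open("config.json", encoding="utf-8") as fh:
        text = fh.read()
    # Placeholder addresses; use the real Nimbus and supervisor hosts.
    updated = set_cluster_ips(text, "10.0.0.1", ["10.0.0.2", "10.0.0.3"])
    with open("config.json", "w", encoding="utf-8") as fh:
        fh.write(updated)
```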

 

  1. TensorFlow

 

For a first look at what neural networks can do, this link is a good start:

https://www.tensorflow.org/tutorials/keras/classification?hl=ru

 

Copy the NLP_Engine_v2.tar.gz archive to any location on disk that has enough free space.

 

Extract it in the same directory. This matters because the model is large:

 

tar -zxvf NLP_Engine_v2.tar.gz -C .

 

The trailing dot in the command means "extract into the current directory".

 

Then install the required Python packages:

 

sudo pip3 install nltk numpy regex stanfordnlp joblib lmdb vaderSentiment polyglot pycld2 morfessor keras tensorflow sklearn elasticsearch pandas

 

sudo apt install python3-mysql.connector python3-pycurl

 

Then go to the NLP_Engine_v2 directory and run:

 

source env/bin/activate

 

Note the nlp directory. It is the Python module that performs the analysis.

 

After that, set the paths to the local nlp Python module:

 

nano nlp_engine/src/bolts/tweet_analysis.py

 

In another file, adjust the environment variables. When running through a container, this should be done in the docker-compose.yml file instead.

 

nano nlp_engine/src/spouts/tweets.py

 

Then change to the nlp_engine directory and run:

 

sparse run

 

To run on a Storm cluster, correct the IP addresses in config.json and use:

 

sparse submit

 

  1. Container for running NLP

 

To start the machine analysis, go to the … directory and run srun.sh

 

docker-compose.yml

 

storm:
  image: storm:latest
  volumes:
    - /home/ubuntu/deploy/storm/suite:/suite
  environment:
    - POSTGRES_HOST=postgresql
    - POSTGRES_PORT=5432
    - POSTGRES_DBNAME=inviso
    - POSTGRES_USER=postgres
    - POSTGRES_PASS=demo
  restart: always
  networks:
    default:
      ipv4_address: 172.18.0.25
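Inside the container, the Python code can pick these settings up from the environment rather than from hardcoded values. A minimal sketch, assuming the variable names from the service definition above; the function name and the fallback defaults are illustrative and not taken from the NLP engine code.

```python
import os


def postgres_settings(environ=None):
    """Collect the PostgreSQL connection settings from environment variables."""
    env = os.environ if environ is None else environ
    return {
        "host": env.get("POSTGRES_HOST", "localhost"),
        "port": int(env.get("POSTGRES_PORT", "5432")),
        "dbname": env.get("POSTGRES_DBNAME", "inviso"),
        "user": env.get("POSTGRES_USER", "postgres"),
        # No default password: an empty value makes a missing variable obvious.
        "password": env.get("POSTGRES_PASS", ""),
    }


if __name__ == "__main__":
    settings = postgres_settings()
    print({k: v for k, v in settings.items() if k != "password"})
```

Passing a plain dict instead of os.environ makes the function easy to test outside the container.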

 

Dockerfile

 

FROM ubuntu

RUN mkdir -p /home/ubuntu

WORKDIR /home/ubuntu

RUN apt-get -y -q update && apt-get install -y wget sudo git libicu-dev python htop curl python3 python3-pip python3-pycurl

#INSTALL JAVA 8

COPY jdk1.8.0_192 jdk1.8.0_192

RUN sudo mkdir -p /usr/lib/jvm

RUN sudo ln -s /home/ubuntu/jdk1.8.0_192 /usr/lib/jvm/java-8-oracle

#STORM

COPY apache-storm-1.0.6 apache-storm-1.0.6

COPY nlp nlp

COPY nlp_engine nlp_engine

RUN sudo echo 'JAVA_HOME="/usr/lib/jvm/java-8-oracle"' >> /etc/environment

RUN echo 'PATH="/home/ubuntu/apache-storm-1.0.6/bin:$PATH"' >> /etc/environment

RUN sudo pip3 install -U git+https://github.com/aboSamoor/polyglot.git@master

RUN sudo pip3 install mysql-connector kafka-python streamparse  flask nltk numpy regex stanfordnlp joblib lmdb vaderSentiment pycld2 morfessor keras tensorflow sklearn elasticsearch pandas

WORKDIR /home/ubuntu

RUN wget https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein

RUN sudo chmod +x lein

RUN sudo cp /home/ubuntu/lein /usr/local/bin

ENV JAVA_HOME /usr/lib/jvm/java-8-oracle

ENV JAVACMD /usr/lib/jvm/java-8-oracle/bin/java

ENV PATH "/home/ubuntu/apache-storm-1.0.6/bin:/usr/lib/jvm/java-8-oracle/bin/:$PATH"

ENV LEIN_ROOT true

COPY start.sh start.sh

CMD bash start.sh

 

start.sh

 

#!/bin/bash

storm version

lein version

sparse quickstart wordcount

sleep 50000

 

  1. Additions and thinking out loud.

 

The wget command below can be used to mirror a website. The resulting directories can be used to test the parsing containers that extract information.

 

To build a site map or index, the standard Storm features should in principle be enough.

 

If this content is filtered well, it can be used to train a neural network.

 

wget -m -l 10 -e robots=off -p -k -E --reject-regex "wp" --no-check-certificate -U "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36" forum.katera.ru
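As a rough sketch of how the mirrored directories could feed a parsing test, the code below walks the download directory and pulls the visible text out of every HTML file using only the standard library. The class and function names are hypothetical; because wget runs with -E, saved pages should end in .html, which is what the filter relies on.

```python
import os
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of one HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def iter_mirror_texts(root):
    """Yield (path, text) for every HTML file under the mirror directory."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith((".html", ".htm")):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    yield path, html_to_text(fh.read())


if __name__ == "__main__":
    for path, text in iter_mirror_texts("forum.katera.ru"):
        print(path, len(text))
```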

 

The trained network can be used to generate messages that keep a dialogue going. For any question, it should immediately give a link to the original source on this forum.

 

 

Swarm

Clone repository

git clone https://github.com/InsiktIntelligence/insikt-deploy-entrega.git

Create Swarm

Now we need to create the swarm nodes (master and slaves) and join them into a cluster. Go to the insikt-deploy-entrega directory and execute swarm.sh:

cd insikt-deploy-entrega/
bash swarm.sh

Note: It is recommended that every swarm node has 8 GB of RAM. Currently some nodes in the file have only 4 GB, because the server they are deployed on does not have enough memory.

Check the result:

docker-machine ls

Expected output: Image

 

Swarm deployment

We need to execute the swarm.sh file. For this step we need to:

  1. Configure the aws CLI
  2. Match the swarm certificates directory
  3. Check which directory the configuration source is read from

Configure aws CLI

Execute the following command and follow the prompts:

aws configure

Match swarm certificates directory

In the deploy.sh file, the following variable must point to the directory with the swarm-1 certificates:

export DOCKER_CERT_PATH="ABSOLUTE_PATH_TO_CERT"
Note: Normally, the certificates are located in ~/.docker/machine/machines/swarm-1
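Before deploying, it can help to confirm that the directory really holds the TLS files the Docker client looks for in DOCKER_CERT_PATH (ca.pem, cert.pem and key.pem). A minimal sketch; the function name is hypothetical.

```python
import os

# File names the Docker client expects inside DOCKER_CERT_PATH.
REQUIRED_FILES = ("ca.pem", "cert.pem", "key.pem")


def missing_cert_files(cert_dir):
    """Return the required certificate files that are absent from cert_dir."""
    cert_dir = os.path.expanduser(cert_dir)
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(cert_dir, name))]


if __name__ == "__main__":
    missing = missing_cert_files("~/.docker/machine/machines/swarm-1")
    print("missing:", missing if missing else "nothing")
```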
Check which directory the configuration source is read from

Currently, the configuration source paths are hardcoded in the file that creates the services.
This file is insikt-deploy-entrega/insikt-swarm/ci/docker-staging.yml
For example:

configs:
  frontendconfig:
    file: /home/ubuntu/tmp/insikt-deploy-entrega/insikt-swarm/config/nginx/nginx.conf
  kibanaconfig:
    file: /home/ubuntu/tmp/insikt-deploy-entrega/insikt-swarm/config/kibana/kibana.yml
  elasticconfig:
    file: /home/ubuntu/tmp/insikt-deploy-entrega/insikt-swarm/config/elasticsearch/elasticsearch.yml
  logstashconfig:
    file: /home/ubuntu/tmp/insikt-deploy-entrega/insikt-swarm/config/logstash/logstash.conf

In this example, the /home/ubuntu/tmp path should be replaced with the directory where the repository was cloned on your server.
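The path replacement can be done in one pass instead of editing each entry by hand. A minimal sketch that rewrites only the file: lines; the function name and the /srv target directory are hypothetical.

```python
def rebase_config_paths(yaml_text, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix on every 'file:' line."""
    out = []
    for line in yaml_text.splitlines():
        key, sep, value = line.partition("file:")
        # Only touch lines whose path actually starts with the old prefix.
        if sep and value.strip().startswith(old_prefix):
            line = key + sep + value.replace(old_prefix, new_prefix, 1)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    path = "insikt-deploy-entrega/insikt-swarm/ci/docker-staging.yml"
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    # /srv is a placeholder for the directory that holds the clone.
    print(rebase_config_paths(text, "/home/ubuntu/tmp", "/srv"))
```

The script prints the result rather than overwriting the file, so the change can be reviewed first.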

 

Once all these prerequisites are done, we can deploy the services.
In the terminal execute:

eval $(docker-machine env swarm-1)
bash swarm.sh

 

If all went well, the end of the terminal output should look something like this:

staging_elasticsearch
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
staging_database
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged

Expose Swarm nodes

 

Using iptables

To expose the node to the outside, we need to forward the requests. This should be possible using iptables.
The file port_forward.sh in insikt-deploy-entrega/ adds the rules to iptables.

Note: Before using it, check that the variable wan_addr is set to your server's WAN IP.

How it works:

bash port_forward.sh IP_CONTAINER PORT_CONTAINER

 

Ex: bash port_forward.sh 192.168.99.100 9090

This command forwards requests from your_wan_ip_server:9090 to 192.168.99.100:9090.

If it does not work

If it does not work, check the following.

iptables port forwarding is enabled

Check whether the option is enabled:

sysctl net.ipv4.ip_forward

Expected output: net.ipv4.ip_forward = 1

If it is 0, execute:

sysctl -w net.ipv4.ip_forward=1
printf "net.ipv4.ip_forward = 1\n" >> /etc/sysctl.conf

There is another firewall

On Ubuntu servers ufw is probably enabled. Check it with:

ufw status

Expected output: Status: inactive

If it is active, execute ufw disable and restart the server.
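The forwarding flag can also be read from a script, which is handy in a pre-deploy check. A minimal sketch that reads the same kernel setting sysctl reports, through /proc; the function name is hypothetical.

```python
def ip_forward_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True when the kernel has IPv4 forwarding switched on."""
    with open(path) as fh:
        return fh.read().strip() == "1"


if __name__ == "__main__":
    if ip_forward_enabled():
        print("net.ipv4.ip_forward = 1")
    else:
        print("forwarding is off; run: sysctl -w net.ipv4.ip_forward=1")
```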

 

Python alerts db

Now we create the database for the Python alerts. First, clone the repository:

git clone https://github.com/InsiktIntelligence/insikt-backend-pyalerts.git

To continue, follow these steps.

Configuration values are retrieved from the config file. The application reads the config file from the path stored in the environment variable INSIKT_APP_SETTINGS. If that variable is not set, the relative path ../config/default.app.cfg is used. To provide your own values for the properties listed in the default ./config/default.app.cfg file, make a copy of that file and set INSIKT_APP_SETTINGS to the path of the copy. For example, you can create a local copy of the default config:

cd insikt-backend-pyalerts
cp ./config/default.app.cfg ./config/be-local-dev-env.app.cfg
export INSIKT_APP_SETTINGS=../config/be-local-dev-env.app.cfg
mysql -h 192.168.99.100 -u test -p insikt < mysql-tested.sql

 

This instructs Flask to read the configuration from your local copy of the default config file. Note that your local copy will not be tracked by git, because a gitignore rule omits all copies of the default config file except the default config itself. So, if you need to change some default configuration parameters, edit the ./config/default.app.cfg file and commit the changes.

You also need to create the language and network_analysis tables:

CREATE TABLE language (id INT auto_increment PRIMARY KEY, name text, status tinyint(1));
CREATE TABLE network_analysis (id INT auto_increment PRIMARY KEY,project_id VARCHAR(256) DEFAULT NULL, start date, end date, source varchar(1000), status tinyint(4));
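The config lookup described in this section (use INSIKT_APP_SETTINGS when it is set, otherwise fall back to the default file) can be sketched in a few lines. This illustrates the rule only and is not the application's actual code; the function name is hypothetical.

```python
import os

# Fallback path named in the text above.
DEFAULT_CONFIG = "../config/default.app.cfg"


def resolve_config_path(environ=None):
    """Return the config path from INSIKT_APP_SETTINGS, or the default."""
    env = os.environ if environ is None else environ
    # An empty value is treated the same as an unset variable.
    return env.get("INSIKT_APP_SETTINGS") or DEFAULT_CONFIG


if __name__ == "__main__":
    print("config file:", resolve_config_path())
```

Running it after the export command above should print the path of the local copy.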

 

Keystore db

The keystore db also needs to be restored. Restore it locally from a dump with:

pg_restore -h 95.216.97.242 -U postgres -W insikt1.sql -d insikt

 

Note: Database must be already created

Create Elasticsearch demo index

We need to expose Kibana with the port_forward.sh file:

bash port_forward.sh 192.168.99.100 5601

After this, open Kibana in a browser at IP_SERVER:5601. Once Kibana is loaded, go to the sidebar menu and click on Dev Tools. You will see the Kibana console.

You need to copy this code to create the demo index.

PUT demo
{
"settings": { "number_of_shards": 6, "number_of_replicas": 1, "analysis": { "analyzer": { "default": { "type": "standard", "tokenizer": "lowercase", "filter": [ "asciifolding" ] } } }, "index.requests.cache.enable": true }, "mappings": { "tweet": { "_source": { "enabled": true },

"properties": { "analysis": { "properties": { "threatScore": { "type": "long", "doc_values": true }, "concepts": { "properties": { "concept": { "type": "keyword", "index": true, "doc_values": true } } }, "docSentiment": { "type": "double", "index": true, "doc_values": true },

"emotions": { "properties": { "emotion": { "type": "keyword", "index": true, "doc_values": true } } }, "entities": { "type": "nested", "properties": { "entity": { "type": "keyword", "index": true, "doc_values": true }, "entityType": { "type": "keyword", "index": true, "doc_values": true }, "type": { "type": "keyword", "index": true, "doc_values": true }

} }, "hashtags": { "properties": { "text": { "type": "keyword", "index": true, "doc_values": true } } }, "keyIdeas": { "properties": { "keyIdea": { "type": "keyword", "index": true, "doc_values": true } } },

"screenName": { "type": "keyword", "index": true, "doc_values": true }, "topics": { "type": "nested", "properties": { "topic": { "type": "keyword", "index": true, "doc_values": true }, "category": { "type": "keyword", "index": true, "doc_values": true } } } } },

"createdAt": { "type": "date", "index": true, "doc_values": true, "format": "dateOptionalTime" }, "detectedLang": { "type": "keyword", "index": true, "doc_values": true }, "geoLocation": { "properties": { "latitude": { "type": "double", "index": true, "doc_values": true }, "longitude": { "type": "double", "index": true, "doc_values": true } } }, "coordinates": {

"index": true, "type": "geo_point" }, "geoname": { "properties": { "countryCode": { "type": "keyword", "index": true, "doc_values": true }, "geonameid": {
"type": "integer", "index": true, "doc_values": true }, "name": {
"type": "keyword", "index": true, "doc_values": true } } },

"hashtagEntities": { "properties": { "end": { "type": "long", "doc_values": true }, "start": { "type": "long", "doc_values": true }, "text": { "type": "keyword", "index": false, "doc_values": true } } },

"id": { "type": "keyword", "index": true, "doc_values": true }, "idLong": { "type": "long", "doc_values": true }, "mediaEntities": { "properties": { "end": { "type": "long", "doc_values": true }, "mediaURL": { "type": "keyword", "index": false, "doc_values": true }, "mediaURLHttps": { "type": "keyword", "index": false, "doc_values": true },

"start": { "type": "long", "doc_values": true } } }, "place": { "properties": { "boundingBoxCoordinates": { "properties": { "latitude": { "type": "double", "index": true, "doc_values": true }, "longitude": { "type": "double", "index": true, "doc_values": true } } },

"boundingBoxType": { "type": "keyword", "index": true, "doc_values": true }, "country": { "type": "keyword", "index": true, "doc_values": true }, "countryCode": { "type": "keyword", "index": true, "doc_values": true }, "fullName": { "type": "keyword", "index": true, "doc_values": true },

"id": { "type": "keyword", "index": true, "doc_values": true }, "name": { "type": "keyword", "index": true, "doc_values": true }, "placeType": { "type": "keyword", "index": true, "doc_values": true }, "url": { "type": "keyword", "index": false, "doc_values": true } } },

"retweetedStatus": { "properties": { "createdAt": { "type": "date",
"index": true, "doc_values": true, "format": "dateOptionalTime"
}, "geoLocation": { "properties": { "latitude": {
"type": "double", "doc_values": true },
"longitude": { "type": "double", "doc_values": true } } },

"hashtagEntities": { "properties": { "end": { "type": "long", "doc_values": true }, "start": { "type": "long", "doc_values": true }, "text": { "type": "keyword", "index": true, "doc_values": true } } }, "id": { "type": "keyword", "index": true, "doc_values": true },

"mediaEntities": { "properties": { "end": { "type": "long", "doc_values": true }, "mediaURL": { "type": "keyword", "index": false, "doc_values": true }, "mediaURLHttps": { "type": "keyword", "index": false, "doc_values": true }, "start": { "type": "long", "doc_values": true } } },

"place": { "properties": { "boundingBoxCoordinates": { "properties": { "latitude": { "type": "double", "doc_values": true }, "longitude": { "type": "double", "doc_values": true } } }, "boundingBoxType": { "type": "keyword", "index": false, "doc_values": true }, "country": { "type": "keyword", "index": true, "doc_values": true },

"countryCode": { "type": "keyword", "index": true, "doc_values": true }, "fullName": { "type": "keyword", "index": true, "doc_values": true }, "id": { "type": "keyword", "index": false, "doc_values": true }, "name": { "type": "keyword", "index": true, "doc_values": true },

"placeType": { "type": "keyword", "index": true, "doc_values": true }, "url": { "type": "keyword", "index": false, "doc_values": true } } }, "source": { "type": "keyword", "index": true }, "symbolEntities": { "properties": { "end": { "type": "long", "doc_values": true },

"start": { "type": "long", "doc_values": true }, "text": { "type": "keyword", "index": true } } }, "text": { "type": "keyword", "index": true }, "urlEntities": { "properties": { "displayURL": { "type": "keyword", "index": false, "doc_values": true }, "end": { "type": "long", "doc_values": true },

"expandedURL": { "type": "keyword", "index": false, "doc_values": true }, "start": { "type": "long", "doc_values": true }, "url": { "type": "keyword", "index": false, "doc_values": true } } },

"user": { "properties": { "createdAt": { "type": "date", "doc_values": true, "format": "dateOptionalTime" }, "description": { "type": "keyword", "index": true }, "favouritesCount": { "type": "long", "index": true, "doc_values": true }, "followersCount": { "type": "long", "index": true, "doc_values": true },

"friendsCount": { "type": "long", "index": true, "doc_values": true }, "id": {
"type": "keyword",
"index": true,
"doc_values": true
},
"lang": { "type": "keyword", "index": true, "doc_values": true }, "location": { "type": "keyword", "index": true, "doc_values": true },

"name": { "type": "keyword", "index": true, "fields": { "raw": { "type": "keyword", "index": true, "doc_values": true } } }, "profileImageUrl": { "type": "keyword", "index": false, "doc_values": true }, "screenName": { "type": "keyword", "index": true, "doc_values": true },

"statusesCount": { "type": "long", "index": true, "doc_values": true }, "url": { "type": "keyword", "index": false, "doc_values": true } } }, "userMentionEntities": { "properties": { "end": { "type": "long", "doc_values": true }, "id": { "type": "long", "doc_values": true },

"name": { "type": "keyword", "index": true }, "screenName": { "type": "keyword", "index": true, "doc_values": true }, "start": { "type": "long", "doc_values": true } } } } }, "savedAt": { "type": "date", "doc_values": true, "format": "dateOptionalTime" },

"source": { "type": "keyword", "index": true }, "symbolEntities": { "properties": { "end": { "type": "long", "doc_values": true }, "start": { "type": "long", "doc_values": true }, "text": { "type": "keyword" } } }, "text": { "type": "keyword", "index": true },

"unifiedText": { "type": "text", "index": true }, "unifiedUrls": { "type": "keyword" }, "urlEntities": { "properties": { "displayURL": { "type": "keyword", "index": false, "doc_values": true }, "end": { "type": "long", "doc_values": true }, "expandedURL": { "type": "keyword", "index": false, "doc_values": true },

"start": { "type": "long", "doc_values": true }, "url": { "type": "keyword", "index": false, "doc_values": true } } }, "user": { "properties": { "createdAt": { "type": "date", "doc_values": true, "format": "dateOptionalTime" }, "description": { "type": "keyword", "index": true },

"favouritesCount": { "type": "long", "doc_values": true }, "followersCount": { "type": "long", "doc_values": true }, "friendsCount": { "type": "long", "doc_values": true }, "id": { "type": "keyword", "index": true, "doc_values": true }, "lang": { "type": "keyword", "index": true, "doc_values": true },

"location": { "type": "keyword", "index": true, "doc_values": true }, "name": { "type": "keyword", "index": true, "fields": { "raw": { "type": "keyword", "index": false, "doc_values": true } } }, "profileImageUrl": { "type": "keyword", "index": false, "doc_values": true },

"screenName": { "type": "keyword", "index": true, "doc_values": true }, "statusesCount": { "type": "long", "doc_values": true }, "url": { "type": "keyword", "index": false, "doc_values": true }, "urlEntity": { "properties": { "displayURL": { "type": "keyword", "index": false, "doc_values": true }, "end": { "type": "long", "doc_values": true },

"expandedURL": { "type": "keyword", "index": false, "doc_values": true }, "start": { "type": "long", "doc_values": true }, "url": { "type": "keyword", "index": false, "doc_values": true } } } } },

"userMentionEntities": { "properties": { "end": { "type": "long" }, "id": { "type": "long" }, "name": { "type": "keyword" }, "screenName": { "type": "keyword", "index": true, "doc_values": true }, "start": { "type": "long" } } } } } }
}
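Request bodies copied from a web page sometimes pick up typographic quotes, which the Kibana console rejects. As a small pre-flight check, the sketch below normalizes such quotes and confirms that the body parses as JSON before it is pasted. The function names are hypothetical; pass only the JSON part, without the PUT demo line.

```python
import json

# Typographic characters that commonly replace straight quotes on web pages.
QUOTE_MAP = {
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
    "\u2033": '"',  # double prime
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
}


def normalize_quotes(text):
    """Replace typographic quotes with their plain ASCII equivalents."""
    return text.translate(str.maketrans(QUOTE_MAP))


def check_console_body(body):
    """Parse the normalized body; raises json.JSONDecodeError if invalid."""
    return json.loads(normalize_quotes(body))


if __name__ == "__main__":
    sample = "{ \u201csettings\u201d: { \u201cnumber_of_shards\u201d: 6 } }"
    print(check_console_body(sample))
```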

Create a new project

Once the ES index is created, it is time to create an Inviso project.

Expose the frontend

First we need to expose the Inviso frontend so that we can log in to the app:

bash port_forward.sh 192.168.99.100 443

 

Then go to https://SERVER_IP
You will see the login screen. Log in with:
User: parronator
Password: 123456Aa

Once you are logged in, follow these steps to create a project:

  1. Click the Add a new project button
  2. Key project
  3. Fill in the project options as you prefer, but twitter must be selected in the Sources option
  4. Click the Create button

You will see a green toast confirming that the project was created, and you will be redirected to the projects dashboard, which now lists the new project.

Load test data

In the insikt-deploy-entrega/ directory there is a file named test_smokees.py. Execute it with:

python test_smokees.py

 

Note: If you get a "Magic numbers" error, delete the files with the .pyc extension in the same directory as test_smokees.py. You can use:

find . -name \*.pyc -delete

Also, inside test_smokees.py, delete this import:

from insiktes import search, search2, search4, search_top, \
mysql_elk, get_id_list, get_source_list, get_top_list, notification

To check that it worked, go to Dev Tools in Kibana and execute:

GET demo/_search
{ "query": { "match_all": {} }
}

 

If it worked, the total number of hits in the response should be at least 32.
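The same check can be automated on a saved search response. A minimal sketch with a hypothetical function name; it accepts both response shapes, since Elasticsearch 6.x reports hits.total as a number while 7.x and later wrap it in an object with a value field. The response.json file name is a placeholder.

```python
import json


def total_hits(response_text):
    """Return the total hit count from an Elasticsearch search response."""
    total = json.loads(response_text)["hits"]["total"]
    # 6.x: plain integer; 7.x and later: {"value": N, "relation": "eq"}.
    return total["value"] if isinstance(total, dict) else total


if __name__ == "__main__":
    # response.json is assumed to hold the output of GET demo/_search.
    with open("response.json", encoding="utf-8") as fh:
        hits = total_hits(fh.read())
    print("ok" if hits >= 32 else "too few hits", hits)
```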

 

 

 

 
