I have a question, in case anybody has experience with this.
Currently I use the Karaf web console plus the hawtio plugin with Talend Runtime.
Routes and Jobs send alerts when errors happen, but not all errors can be handled inside the route itself - for example, when a route does not start because of a connection timeout or for other reasons. So I use a live-ping algorithm as an additional layer.
This combination covers most problems, but I am looking for additional ways.
After some research and testing I found that Apache Decanter could be a solution, but:
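For readers who have not built such a layer, a minimal sketch of a live ping in Java could look like the one below. It is only an illustration, not the exact algorithm used here: the class name `LivenessPing`, the default host `localhost` and the default port `8040` are assumptions. It only checks whether a TCP connection to the endpoint can be opened within a timeout.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks that an endpoint accepts TCP connections.
public class LivenessPing {

    /** Returns true if a TCP connection to host:port opens within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unknown host: treat all as "not alive".
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port are placeholders; pass the real endpoint as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8040;
        boolean alive = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (alive ? " is alive" : " is NOT reachable"));
        // A non-zero exit code lets a scheduler or wrapper script raise the alarm.
        System.exit(alive ? 0 : 1);
    }
}
```

Run from a scheduler, the exit code can drive whatever alert channel is already in place. A successful TCP connect only proves that something is listening, not that the route behind it works.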
- the default repository does not include the sla-email alerter
- the current version of the Decanter appenders is not compatible with an external ELK 5.* (this would not be a problem if not for the previous point)
- newer versions of Decanter (1.3.0) are not compatible (or I am not doing everything properly) with the Talend version of Karaf - 4.0.7
It would be interesting to share with the community:
- has anybody successfully installed Decanter with the sla-alert module?
- who uses what for monitoring and alarming/alerting? (alarms and alerts are more interesting than monitoring)
Best regards, Vlad
We have redirected your issue to our Talend ESB experts and will come back to you as soon as we can.
Thanks for your time.
Some details about Decanter and your question:
- the Decanter features repo includes the decanter-sla-email feature. You have to install this one.
- if you use the decanter-appender-elasticsearch-rest, it can work and send the data to an external Elasticsearch instance, including version 5.x.
- Decanter is compatible with any Karaf version, including the one in Talend. What's the issue?
Thank you for the response. I am not a great expert in this area, but this is how it looks from my side:
Talend 6.3.1 ESB (Community)
karaf@trun()> feature:list | grep decanter
decanter-common                 | 3.0.0.SNAPSHOT |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Karaf Decanter API
decanter-simple-scheduler       | 3.0.0.SNAPSHOT |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Karaf Decanter Simple Scheduler
decanter-collector-log          | 3.0.0.SNAPSHOT |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Karaf Decanter Log Messages Collector
decanter-collector-jmx          | 3.0.0.SNAPSHOT |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Karaf Decanter JMX Collector
decanter-appender-log           | 3.0.0.SNAPSHOT |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Karaf Decanter Log Appender
decanter-appender-elasticsearch | 3.0.0.SNAPSHOT |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Karaf Decanter Elasticsearch Appender
elasticsearch                   | 1.3.4          |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Embedded Elasticsearch node
kibana                          | 3.1.1          |  | Uninstalled | karaf-decanter-3.0.0-SNAPSHOT | Embedded Kibana dashboard
karaf@trun()>
As you can see, there are no sla modules.
If we change the repository with this command:
feature:repo-add decanter 1.3.0
there will be many more features ... but
after this, any attempt to install kibana or some other modules from the repository (not all, but a few important ones) freezes Karaf: it stops responding, and after a restart it still does not work. Tested on 3 different machines, Mac and Win.
The internal Kibana does not start.
If I install only the collector and appenders, the REST appender does not connect to the external Elasticsearch (error message - client must be at least version 2.0).
Finally I downgraded the external ELK (it is a working solution, but not ideal because other services use 5.*).
After this I installed the log collector, the appender and email-sla from the 1.3.0 repository. Both appenders started working, and log4j from Studio started logging to logstash.
Now I am playing around with email-sla and the ELK watcher. It is new for me, so this part is in progress.
The main idea of all these games is to have alarms for some route errors which we do not get now.
In the DI part I already use email, Slack, SNMP and OpsGenie alarms (different levels). In routes, alarms for internal errors also all work fine.
But when a route does not start because an external server does not answer, I have not found a way to catch this error from the route. It dies and does not send any notice, only an error in the log.
This was asked before - https://www.talendforge.org/forum/viewtopic.php?id=56642
The monitoring part is all fine, but after a couple of years with good alarms from DI Jobs, I was surprised that I could not achieve the same in routes, and I am trying to resolve it.
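One way to look at this problem from outside the route is to ask JMX which routes are actually running. The sketch below is only an illustration under assumptions: it assumes Camel registers its routes under the `org.apache.camel` JMX domain with `type=routes` and exposes a `State` attribute whose value is `Started` for a running route. The class name `RouteStateCheck` and the route id `demoRoute` are hypothetical. A route that never started may not be registered at all, so the check compares against a list of expected route ids.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper: reports expected routes that are missing or not started.
public class RouteStateCheck {

    public static List<String> findProblems(MBeanServerConnection server,
                                            Collection<String> expectedRoutes)
            throws JMException, IOException {
        // Assumed naming scheme of the Camel route MBeans.
        ObjectName pattern = new ObjectName("org.apache.camel:type=routes,*");
        Map<String, String> states = new HashMap<>();
        for (ObjectName name : server.queryNames(pattern, null)) {
            String routeId = name.getKeyProperty("name");
            if (routeId != null && routeId.startsWith("\"")) {
                routeId = ObjectName.unquote(routeId); // route ids are usually quoted
            }
            states.put(routeId, String.valueOf(server.getAttribute(name, "State")));
        }
        List<String> problems = new ArrayList<>();
        for (String route : expectedRoutes) {
            String state = states.get(route);
            if (state == null) {
                problems.add(route + ": not registered");
            } else if (!"Started".equals(state)) {
                problems.add(route + ": " + state);
            }
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Uses the MBean server of the current JVM; "demoRoute" is a placeholder id.
        List<String> problems = findProblems(
                ManagementFactory.getPlatformMBeanServer(), Arrays.asList("demoRoute"));
        problems.forEach(System.out::println);
    }
}
```

The returned list could then be fed into the same alert channels that already work for the DI Jobs. For a check running outside the Karaf JVM, a remote `MBeanServerConnection` would have to be passed in instead of the platform MBean server.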
Best regards, Vlad