
Log4j tips and tricks

Overview

Log4j, incorporated in Talend software, is an essential tool for discovering and solving problems. This article shows you some tips and tricks for using Log4j.

 

The examples in this article use Log4j v1, but Talend 7.3 uses Log4j v2. Although the syntax is different between the versions, anything you do in Log4j v1 should work, with some modification, in Log4j v2. For more information on Log4j v2, see Configuring Log4j, available in the Talend Help Center.

 

Configuring Log4j in Talend Studio

Configure the log4j.xml file in Talend Studio by navigating to File > Edit Project properties > Log4j.

 

Figure: Log4j configuration for Jobs in Studio

You can also configure Log4j using properties files or built-in classes; however, that is not covered in this article.

 

Emitting messages

You can execute code in a tJava component to create Log4j messages, as shown in the example below:

 

log.info("Hello World");
log.warn("HELLO WORLD!!!");

This code results in the following messages:

[INFO ]: myproject.myjob - Hello World
[WARN ]: myproject.myjob - HELLO WORLD!!!

 

Routines

You can use Log4j to emit messages by creating a logger class in a routine, as shown in the example below:

public class logSample {
    /* Pick the logger that fits: one named after the class, or one with a free-text name. */
    private static org.apache.log4j.Logger log = org.apache.log4j.Logger.getLogger(logSample.class);
    private static org.apache.log4j.Logger log1 = org.apache.log4j.Logger.getLogger("from_routine_logSample");
    /*...*/
    public static void helloExample(String message) {
        // Fall back to a default greeting target when no message is passed.
        if (message == null) {
            message = "World";
        }
        log.info("Hello " + message + " !");
        log1.info("Hello " + message + " !");
    }
}

To call this routine from Talend, use the following command in a tJava component:

logSample.helloExample("Talend");

The log results will look like this:

[INFO ]: routines.logSample - Hello Talend !
[INFO ]: from_routine_logSample - Hello Talend !

Passing <routineName>.class to getLogger puts the class name in the log results. Passing free text puts that text in the log results. The output is similar to what System.out produces, but Log4j output can be customized and fine-tuned.

 

Controlling Log4j message formats with patterns

You can use patterns to control the Log4j message format. Adding a pattern to an Appender customizes its output by adding extra information to each message. For example, when multiple threads are used, the default pattern doesn't show which thread emitted a message. Use the %t conversion character to add the thread name to the logs. To identify new messages easily, use %d to add a timestamp to each log message.

 

To add thread names and timestamps, set the following ConversionPattern parameter in the layout of the CONSOLE appender in the Log4j template:

<param name="ConversionPattern"  		  		
       value= "%d{yyyy-MM-dd HH:mm:ss}  [%-5p] (%t): %c - %m%n" />

 

The pattern displays messages as follows:

ISO formatted date [log level] (thread name): class projectname.jobname - message contents
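
To make the conversion characters concrete, the following sketch rebuilds the same layout with plain JDK calls. It is not how Log4j implements its pattern layout; the class name PatternSketch and the method format are hypothetical and exist only for illustration.

```java
// Sketch only: emulates the layout of the pattern
// "%d{yyyy-MM-dd HH:mm:ss}  [%-5p] (%t): %c - %m%n" with plain JDK calls.
// PatternSketch and format(...) are hypothetical names, not part of Log4j.
class PatternSketch {
    static String format(java.util.Date when, String level, String thread,
                         String logger, String message) {
        // %d{yyyy-MM-dd HH:mm:ss} -> timestamp rendered with the given date format
        String timestamp = new java.text.SimpleDateFormat(
                "yyyy-MM-dd HH:mm:ss", java.util.Locale.ROOT).format(when);
        // %-5p -> level, left-justified and padded to five characters
        // %t -> thread name, %c -> logger name, %m -> message, %n -> line separator
        return String.format("%s  [%-5s] (%s): %s - %s%n",
                timestamp, level, thread, logger, message);
    }

    public static void main(String[] args) {
        // Prints one line shaped like the Log4j output for the current thread.
        System.out.print(format(new java.util.Date(), "INFO",
                Thread.currentThread().getName(), "myproject.myjob", "Hello World"));
    }
}
```

The padding in %-5p is why INFO and WARN appear with a trailing space inside the brackets, which keeps the columns aligned with five-letter levels such as DEBUG.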

 

If the following Java code is executed in three parallel threads, using the sample pattern above helps distinguish between the threads.

 

java.util.Random rand = new java.util.Random();
log.info("Hello World");
Thread.sleep(rand.nextInt(1000));
log.warn("HELLO WORLD!!!");
logSample.helloExample("Talend");

Figure: Parallel calls

This results in an output that shows which thread emitted the message and when:

2020-05-19 12:18:30  [INFO ] (tParallelize_1_e45bc79b-d61f-45a3-be8f-7089ab6d565d): myproject.myjob_0_1.myjob - Hello World
2020-05-19 12:18:30  [INFO ] (tParallelize_1_4064c9b8-0585-41e0-b9f0-95fb31e602b7): myproject.myjob_0_1.myjob - Hello World
2020-05-19 12:18:30  [INFO ] (tParallelize_1_a8ef1065-0106-4b45-8a60-d02a9cbe1f00): myproject.myjob_0_1.myjob - Hello World
2020-05-19 12:18:30  [WARN ] (tParallelize_1_e45bc79b-d61f-45a3-be8f-7089ab6d565d): myproject.myjob_0_1.myjob - HELLO WORLD!!!
2020-05-19 12:18:30  [INFO ] (tParallelize_1_e45bc79b-d61f-45a3-be8f-7089ab6d565d): routines.logSample - Hello Talend !
2020-05-19 12:18:30  [INFO ] (tParallelize_1_e45bc79b-d61f-45a3-be8f-7089ab6d565d): from_routine_logSample - Hello Talend !
2020-05-19 12:18:30  [WARN ] (tParallelize_1_a8ef1065-0106-4b45-8a60-d02a9cbe1f00): myproject.myjob_0_1.myjob - HELLO WORLD!!!
2020-05-19 12:18:30  [INFO ] (tParallelize_1_a8ef1065-0106-4b45-8a60-d02a9cbe1f00): routines.logSample - Hello Talend !
2020-05-19 12:18:30  [INFO ] (tParallelize_1_a8ef1065-0106-4b45-8a60-d02a9cbe1f00): from_routine_logSample - Hello Talend !
2020-05-19 12:18:31  [WARN ] (tParallelize_1_4064c9b8-0585-41e0-b9f0-95fb31e602b7): myproject.myjob_0_1.myjob - HELLO WORLD!!!
2020-05-19 12:18:31  [INFO ] (tParallelize_1_4064c9b8-0585-41e0-b9f0-95fb31e602b7): routines.logSample - Hello Talend !
2020-05-19 12:18:31  [INFO ] (tParallelize_1_4064c9b8-0585-41e0-b9f0-95fb31e602b7): from_routine_logSample - Hello Talend !

 

If you want to know which component belongs to which thread, you need to change the log level to add more information.

You can do this in Studio on the Run tab, in the Advanced settings tab of the Job execution.

Figure: Configure Log4j level from Studio

 

In Talend Administration Center, you do this in Job Conductor.

Figure: Overriding Log4j level from Talend Administration Center

 

Using the DEBUG level adds a few extra lines to the log file, which can help you understand which parameters resulted in a certain output:

2020-05-19 12:51:50  [DEBUG] (tParallelize_1_c6de81be-1bbf-4f9b-9b7a-3d92bf345c40): myproject.myjob_0_1.myjob - tParallelize_1 - The subjob starting with the component 'tJava_1' starts.
2020-05-19 12:51:50  [DEBUG] (tParallelize_1_fa636a36-9f53-423f-abc6-b26c4c52c5b4): myproject.myjob_0_1.myjob - tParallelize_1 - The subjob starting with the component 'tJava_3' starts.
2020-05-19 12:51:50  [DEBUG] (tParallelize_1_d4da8ea0-4401-4229-82e9-86ff0ed67c3b): myproject.myjob_0_1.myjob - tParallelize_1 - The subjob starting with the component 'tJava_2' starts.

 

Keep in mind the following:

  • Changing the default log pattern causes Studio to stop coloring the messages.
  • The default log level in Studio is defined by the root logger's priority value (WARN by default).
  • Changing the log level changes the number of messages.
  • Changing the pattern changes the message format.

 

Logging levels

The following table describes the Log4j logging levels you can use in Talend applications:

Logging level | Description
TRACE | Emits everything that is available, so every row behaves as if it had a tLogRow component attached. This can make the log file extremely large; however, it also shows the transformation done by each component.
DEBUG | Displays the component parameters, database connection information, and executed queries, and indicates which row is being processed, but does not capture the actual data.
INFO | Includes the Job start and finish times, and how many records were read and written.
WARN | Talend components do not use this logging level.
ERROR | Writes exceptions. These exceptions do not necessarily cause the Job to halt.
FATAL | When this appears, the Job execution is halted.
OFF | Nothing is emitted.

 

These levels offer high-level control over messages. Changing the level from outside the Job, for example from Studio or Talend Administration Center, affects only the Appenders that do not specify their own log level and therefore rely on the level set by the root logger.
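
The interaction between the root logger's level and an Appender's own level can be sketched with plain JDK code. This is a simplified model, not Log4j code: LevelSketch, reaches, and the parameter names are hypothetical, and the sketch assumes that a message must pass the logger's level first and then the Appender's Threshold, if one is set.

```java
// Sketch only: models the level checks with plain JDK code.
// LevelSketch, reaches and its parameters are hypothetical names, not Log4j API.
class LevelSketch {
    // Ordered from most verbose to least verbose, as in the list of logging levels.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR, FATAL, OFF }

    // A message reaches an Appender only if it passes the logger's level and,
    // when the Appender defines one, the Appender's own Threshold.
    static boolean reaches(Level message, Level loggerLevel, Level appenderThreshold) {
        boolean passesLogger = message.compareTo(loggerLevel) >= 0;
        boolean passesAppender = appenderThreshold == null
                || message.compareTo(appenderThreshold) >= 0;
        return passesLogger && passesAppender;
    }

    public static void main(String[] args) {
        // Root logger lowered to DEBUG from outside the Job.
        // Console Appender without a Threshold:
        System.out.println(reaches(Level.INFO, Level.DEBUG, null));
        // File Appender that keeps its own Threshold of ERROR:
        System.out.println(reaches(Level.INFO, Level.DEBUG, Level.ERROR));
    }
}
```

In this model, lowering the root level to DEBUG lets INFO messages through to an Appender without a Threshold, while an Appender with a Threshold of ERROR still drops them.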

 

Using Appenders

Log4j messages are processed by Appenders, which route the messages to different outputs, such as the console, files, or Logstash. Appenders can even send messages to databases, but for database logs, the built-in Stats & Logs feature might be a better solution.

 

Storing Log4j messages in files can be useful when working with standalone Jobs. Here is an example of a file Appender:

<appender name="ROLLINGFILE" class="org.apache.log4j.RollingFileAppender">
  <param name="file" value="rolling_error.log"/>
  <param name="Threshold" value="ERROR"/>
  <param name="MaxFileSize" value="10000KB"/>
  <param name="MaxBackupIndex" value="5"/>
  <layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" 
           value="%d{yyyy-MM-dd HH:mm:ss}  [%-5p] (%t): %c - %m%n"/>
  </layout>
</appender>

You can use multiple Appenders to have multiple files with different log levels and formats. Use the parameters to control the content. A Threshold value of ERROR omits information about normal Job execution, whereas a value of INFO includes it but makes errors harder to detect.

 

For more information on Appenders, see the Apache Interface Appender page.

 

Using filters

You can use filters with Appenders to keep messages that are not of interest out of the logs. Log4j v2 also offers filters based on regular expressions.

 

The following example filter omits any Log4j messages that contain the string " - Adding the record ".

<filter class="org.apache.log4j.varia.StringMatchFilter">
	<param name="StringToMatch" value=" - Adding the record " />
	<param name="AcceptOnMatch" value="false" />
</filter>
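
The decision such a filter makes for each message can be sketched in plain JDK code. This is an approximation for illustration, not the Log4j source: StringMatchSketch, Decision, and decide are hypothetical names. A non-matching message is left undecided (NEUTRAL), so later filters or the Appender itself can still handle it.

```java
// Sketch only: approximates the decision a string-matching filter makes.
// StringMatchSketch, Decision and decide(...) are hypothetical names, not Log4j API.
class StringMatchSketch {
    enum Decision { ACCEPT, NEUTRAL, DENY }

    static Decision decide(String message, String stringToMatch, boolean acceptOnMatch) {
        // No match (or nothing to compare): leave the decision to the next filter.
        if (message == null || stringToMatch == null || !message.contains(stringToMatch)) {
            return Decision.NEUTRAL;
        }
        // Match: AcceptOnMatch=false drops the message, true keeps it.
        return acceptOnMatch ? Decision.ACCEPT : Decision.DENY;
    }
}
```

With the parameters from the filter above, a message containing " - Adding the record " is denied, and any other message stays neutral and is logged as usual.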

 

Overriding default settings in Talend Administration Center

When a Java program that uses Log4j starts, Log4j attempts to load its settings from the log4j.xml file. You can modify this file to change the default settings, or you can force Java to use a different file. For example, you can do this for Jobs deployed to Talend Administration Center by configuring the JVM parameters. This way, you can change the logging behavior of a Job without modifying the Job itself, and you can revert to the original logging behavior by clearing the Active check box.

Figure: Override Log4j using JVM parameters
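
As a rough way to confirm which configuration file the JVM parameters point to, a Job could read the corresponding system properties, for example from a tJava component. Log4j v1 reads the log4j.configuration property and Log4j v2 reads log4j.configurationFile; the class name Log4jConfigCheck and the file path in the comment are hypothetical.

```java
// Sketch only: reports which Log4j configuration was forced through JVM parameters.
// Log4jConfigCheck is a hypothetical helper, not part of Talend or Log4j.
class Log4jConfigCheck {
    // Returns the forced configuration location, or null if no override is set.
    static String forcedConfiguration() {
        // Log4j v1, set for example with -Dlog4j.configuration=file:/path/to/custom_log4j.xml
        String v1 = System.getProperty("log4j.configuration");
        // Log4j v2, set with -Dlog4j.configurationFile=...
        String v2 = System.getProperty("log4j.configurationFile");
        return v1 != null ? v1 : v2;
    }

    public static void main(String[] args) {
        String forced = forcedConfiguration();
        System.out.println(forced == null
                ? "No override: the default log4j.xml lookup applies"
                : "Log4j configuration forced to " + forced);
    }
}
```

A null result means that no override is active and the default lookup described above applies.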

Version history: revision 19 of 19. Last update: 06-11-2020 02:50 PM.