ysplit

Version 2

    Introduction

    Got a Solr/HBase/YARN application log of 10-20 GB? Pulling your hair out because Notepad++ can't open such huge files? Don't panic: this simple tool splits YARN logs into multiple files (per container, or per container and log type) and builds an HTML index so you can navigate through them quickly.

    Use Case / Scenario

    • Splitting a YARN log file into one file per
      • container
      • container + log type
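
    The two split modes above can be sketched roughly as follows. This is a minimal illustration, not the actual YLogSplit source: the class name, method name, and output file naming are hypothetical, and it assumes the input uses the "Container: <id> on <host>" and "LogType:<name>" header lines printed by the yarn logs command. It also groups lines in memory for brevity; a tool meant for 10-20 GB files would write each line straight to its output file instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual YLogSplit implementation.
class YarnLogSplitSketch {

    // Assumed header markers, as printed by the `yarn logs` CLI.
    static final String CONTAINER_PREFIX = "Container: ";
    static final String LOGTYPE_PREFIX = "LogType:";

    /**
     * Groups log lines by container, or by container + log type when
     * splitByLogType is true. Map keys are relative output file names.
     */
    static Map<String, List<String>> split(List<String> lines, boolean splitByLogType) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        String container = "unknown_container"; // lines seen before any header
        String key = container + ".log";
        for (String line : lines) {
            if (line.startsWith(CONTAINER_PREFIX)) {
                // "Container: container_..._000001 on host_8041" -> keep the id only
                container = line.substring(CONTAINER_PREFIX.length()).split("\\s+")[0];
                key = container + ".log";
            } else if (splitByLogType && line.startsWith(LOGTYPE_PREFIX)) {
                // One file per log type, inside a directory named after the container
                key = container + "/" + line.substring(LOGTYPE_PREFIX.length()).trim();
            }
            out.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
        }
        return out;
    }
}
```

    Calling split(lines, false) yields one entry per container, while split(lines, true) additionally yields one entry per log type (for example stderr or stdout) under each container.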

    Features

    1. Works on both Windows and Linux
    2. Builds a separate directory for each container's logs
    3. Builds an HTML index of the generated files
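
    As an illustration of the index feature, the sketch below builds a simple HTML page with one link per generated file. The class, the method names, and the page layout are assumptions made for this example; the index that ysplit actually produces may look different.

```java
import java.util.List;

// Hypothetical sketch of an HTML index builder; not the actual ysplit output format.
class HtmlIndexSketch {

    // Escapes the characters that are unsafe inside HTML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Returns an HTML page listing each relative path as a clickable link.
    static String buildIndex(String title, List<String> relativePaths) {
        StringBuilder html = new StringBuilder();
        html.append("<html><head><title>").append(escape(title)).append("</title></head><body>\n");
        html.append("<h1>").append(escape(title)).append("</h1>\n<ul>\n");
        for (String path : relativePaths) {
            html.append("  <li><a href=\"").append(escape(path)).append("\">")
                .append(escape(path)).append("</a></li>\n");
        }
        html.append("</ul>\n</body></html>\n");
        return html.toString();
    }
}
```

    Writing the returned string to a file such as index.html inside the output directory (a file name assumed here) keeps the relative links valid when the whole directory is moved or shared.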

    Usage / Example

    1. Download the source from:
      https://informatica-my.sharepoint.com/:f:/p/aganta/El9OuXFDQpJFmL5n5nT3EUwB-I7x0FNZKgZ44yum1kNWHw?e=2nkALc
    2. Extract the zip.
    3. Add the extracted directory to the PATH.
    4. Run the following command to split the file per container and log type:
      ysplit <file_name> true
    5. To split the log per container only, omit the "true" argument.
    6. The output is generated in the same directory as the input file, in a directory named
      splitlogs_<fileName>
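
    The output location described in step 6 could be derived along these lines. This is only a sketch: the class and method names are hypothetical, and it assumes that <fileName> is the full input file name including its extension, which the description above does not state explicitly.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of how the output directory name could be derived.
class OutputDirSketch {

    // Returns <directory of input file>/splitlogs_<input file name>.
    // Assumption: the file extension is kept as part of <fileName>.
    static Path outputDirFor(Path inputFile) {
        Path abs = inputFile.toAbsolutePath();
        return abs.resolveSibling("splitlogs_" + abs.getFileName());
    }

    public static void main(String[] args) {
        Path out = outputDirFor(Paths.get("application_1583853672476_0006.log"));
        System.out.println(out);
    }
}
```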

    Sample Run Output

    1. A demo run on Solr logs

      Command-line output of splitting a file

      C:\Users\mpataki\Desktop\cases\arab_bank\cs_startup>ysplit application_1583853672476_0006.log true
      C:\Users\mpataki\Desktop\cases\arab_bank\cs_startup>java -cp C:\Users\mpataki\Desktop\INFA\tools\bin\ ylogsplit.YLogSplit application_1583853672476_0006.log true
      Processed 1000 lines (114.905 Kbytes) (0.112 MBytes) (0 seconds)
      Processed 2000 lines (273.499 Kbytes) (0.267 MBytes) (0 seconds)
      Processed 3000 lines (463.347 Kbytes) (0.452 MBytes) (0 seconds)
      Processed 4000 lines (609.533 Kbytes) (0.595 MBytes) (0 seconds)
      Processed 5000 lines (735.794 Kbytes) (0.719 MBytes) (0 seconds)
      Processed 6000 lines (939.706 Kbytes) (0.918 MBytes) (0 seconds)
      Processed 7000 lines (1021.737 Kbytes) (0.998 MBytes) (0 seconds)
      Processed 8000 lines (1304.443 Kbytes) (1.274 MBytes) (0 seconds)
      Processed 9000 lines (1461.062 Kbytes) (1.427 MBytes) (0 seconds)
      Processed 10000 lines (1576.028 Kbytes) (1.539 MBytes) (0 seconds)
      Processed 11000 lines (1766.434 Kbytes) (1.725 MBytes) (0 seconds)
      Processed 12000 lines (1847.494 Kbytes) (1.804 MBytes) (0 seconds)
      Processed 13000 lines (2131.066 Kbytes) (2.081 MBytes) (0 seconds)
      Processed 14000 lines (2277.213 Kbytes) (2.224 MBytes) (0 seconds)
      Processed 15000 lines (2421.986 Kbytes) (2.365 MBytes) (0 seconds)
      Processed 16000 lines (2617.817 Kbytes) (2.556 MBytes) (0 seconds)
      Processed 17000 lines (2697.476 Kbytes) (2.634 MBytes) (0 seconds)
      Processed 18000 lines (2895.697 Kbytes) (2.828 MBytes) (1 seconds)
      Processed 19000 lines (3085.788 Kbytes) (3.013 MBytes) (1 seconds)
      Processed 20000 lines (3166.055 Kbytes) (3.092 MBytes) (1 seconds)
      Processed 21000 lines (3357.562 Kbytes) (3.279 MBytes) (1 seconds)
      Processed 22000 lines (3439.950 Kbytes) (3.359 MBytes) (1 seconds)
      Processed 23000 lines (3600.891 Kbytes) (3.516 MBytes) (1 seconds)
      Processed 24000 lines (3887.715 Kbytes) (3.797 MBytes) (1 seconds)
      Processed 25000 lines (4078.471 Kbytes) (3.983 MBytes) (1 seconds)

       

    2. The index

    Files (Runnable & Source Code)

    • ysplit.bat                  script for Windows
    • ysplit.sh                   script for Linux
    • ylogsplit/YLogSplit.class   the splitting code (compiled binary)
    • ylogsplit/YLogSplit.java    source code

    System Requirements

    • Java 8