Sunday, July 31, 2016

Source code exploration using regular expressions


I once needed to remove hard-coded values from a huge program. Doing it manually was not realistic: there were too many different combinations of numbers, and running "Find" in SE80 for each one was a nightmare. So I decided to write a small tool that reads the source code of any program together with all its includes and searches it by regex. Here it is.

Usage example


To use the tool you need a basic understanding of regular expressions. The tool is only as powerful as your regex skills.


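In the output, each matching line is listed with its include name, line number and code, followed by the total number of explored lines.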


Regex examples for source code exploration tool

Below are sample regular expressions to use as input, each with a description of what it finds in the source code:

Any 4 digit number
^.+(\d{4}).+$

Any 4 digit number within apostrophes
^.+(')(\d{4})(').+$
  
Any 4 digit number within apostrophes but not in commented rows
^[^\*].+(')(\d{4})(').+$

Any 4 digit number within apostrophes but in commented rows only
^[*].+(')(\d{4})(').+$

Any word within apostrophes, e.g. non-numeric literals (note that \w also matches digits, so purely numeric literals are found as well)
^[^\*].+(')(\w+)(').+$

Any 4 digit number 
1. within apostrophes 
2. not in commented rows 
3. avoiding words “bildschirm” or “screen” or “dynnr” preceding the digit
^[^\*]((?!bildschirm|screen|dynnr).)*(')(\d{4})(').+$
  
A four-digit number 
starting with “09”
followed by any two digits
note: it also matches inside longer strings
^.+(09)(\d{2}).+$

Search for usernames 
starting with “IT”
followed by any five digits
note: this can uncover behavior hard-coded for specific users
^.+(IT)(\d{5}).+$
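
The patterns above are all wrapped in ^.+ and .+$, presumably because the tool calls match( ), which has to cover the whole line rather than find a substring. A side effect is that at least one character must precede and follow the searched text. Below is a minimal sketch for trying a pattern against a few lines before running the full report. The report name and the sample lines are made up for illustration; the create and match calls are the same ones the tool itself uses.

```abap
REPORT y_regex_pattern_check. " hypothetical report name

DATA: lt_samples TYPE TABLE OF string,
      lv_sample  TYPE string,
      lv_regex   TYPE string,
      lv_match   TYPE c LENGTH 1,
      lo_matcher TYPE REF TO cl_abap_matcher.

" Pattern under test: any 4 digit number within apostrophes
lv_regex = `^.+(')(\d{4})(').+$`.

" Made-up sample lines, not taken from any real program
APPEND `  IF sy-dynnr = '0100'.` TO lt_samples.
APPEND `  lv_code = '0100'`      TO lt_samples.
APPEND `* old value was '0100'.` TO lt_samples.

LOOP AT lt_samples INTO lv_sample.
  " Same calls as in the tool: the whole line has to match the pattern
  lo_matcher = cl_abap_matcher=>create(
                 pattern     = lv_regex
                 ignore_case = abap_true
                 text        = lv_sample ).
  lv_match = lo_matcher->match( ).

  IF lv_match EQ abap_true.
    WRITE: / 'MATCH   :', lv_sample.
  ELSE.
    WRITE: / 'NO MATCH:', lv_sample.
  ENDIF.
ENDLOOP.
```

With these samples, the first and third lines should be reported as matches and the second should not, since nothing follows its closing apostrophe. The third line also shows that this pattern alone does not exclude comment rows.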

  

Source code 

*&---------------------------------------------------------------------*
*& Report  Y_REPORT_REGEX_SEARCH
*&
*&---------------------------------------------------------------------*
*&
*& Search source code by regex
*&---------------------------------------------------------------------*

REPORT y_report_regex_search.

TABLES: tstc, trdir.

*******************************************************************
*   SCREENS - choose transaction or program
*******************************************************************
SELECTION-SCREEN BEGIN OF BLOCK frame1 WITH FRAME TITLE text-001.
" transaction
SELECTION-SCREEN BEGIN OF LINE.
PARAMETERS: p_trans  RADIOBUTTON GROUP grp1 USER-COMMAND flagcommand. " DEFAULT 'X'.
SELECTION-SCREEN COMMENT 3(60) text-002. " Transaction
SELECTION-SCREEN END OF LINE.
" program
SELECTION-SCREEN BEGIN OF LINE.
PARAMETERS: p_prog  RADIOBUTTON GROUP grp1 DEFAULT 'X'.
SELECTION-SCREEN COMMENT 3(60) text-003. " Program
SELECTION-SCREEN END OF LINE.
SELECTION-SCREEN SKIP.
SELECT-OPTIONS so_tcode FOR tstc-tcode NO INTERVALS. " Transaction Code
SELECT-OPTIONS so_prog FOR trdir-name NO INTERVALS.  " Program name

SELECTION-SCREEN END OF BLOCK frame1.


SELECTION-SCREEN BEGIN OF BLOCK frame2 WITH FRAME TITLE text-004.
PARAMETERS p_regex TYPE c LENGTH 100 LOWER CASE. " Regular expression
SELECTION-SCREEN END OF BLOCK frame2.

*******************************************************************
AT SELECTION-SCREEN OUTPUT. " PBO action
*******************************************************************
  " Set parameter according to radio
  LOOP AT SCREEN.
    IF p_prog <> 'X'.
      IF screen-name CS 'so_prog'.
        screen-active = 0.
        MODIFY SCREEN.
        CONTINUE.
      ENDIF.
    ELSEIF p_trans <> 'X'.
      IF screen-name CS 'so_tcode'.
        screen-active = 0.
        MODIFY SCREEN.
        CONTINUE.
      ENDIF.
    ENDIF.
  ENDLOOP.


*******************************************************************
START-OF-SELECTION.
*******************************************************************

* Get program name for transaction
  CONDENSE so_tcode-low NO-GAPS.
  IF p_trans EQ 'X'.
    CLEAR so_prog.
    SELECT SINGLE pgmna
      FROM tstc
      INTO so_prog-low
      WHERE tcode = so_tcode-low.
    IF sy-subrc NE 0.
      WRITE: / 'Transaction does not exist : ', so_tcode-low.
      EXIT.
    ENDIF.
  ELSEIF p_prog EQ 'X'.
    SELECT SINGLE name
    FROM trdir
    INTO so_prog-low
    WHERE name = so_prog-low.
    IF sy-subrc NE 0.
      WRITE: / 'Program does not exist : ', so_prog-low.
      EXIT.
    ENDIF.
  ENDIF.

  CONDENSE so_prog-low NO-GAPS.
  IF so_prog-low IS INITIAL.
    WRITE: / 'There is no program given.'.
    EXIT.
  ENDIF.

* Internal table holding include names
  TYPES:
  BEGIN OF ts_includes,
  prgname(40),
  END OF ts_includes.

  DATA: lt_includes TYPE STANDARD TABLE OF ts_includes,
        ls_includes LIKE LINE OF lt_includes,
        lt_lines    TYPE TABLE OF string,
        ls_line     LIKE LINE OF lt_lines,
        lv_line_num        TYPE i,
        lv_lines_total_num TYPE i.

  DATA: lc_matcher TYPE REF TO cl_abap_matcher,
        lv_match   TYPE c LENGTH 1,
        lv_regex   TYPE string,
        lv_exception_detail TYPE string,
        oref   TYPE REF TO cx_root.

* Get include list
  CALL FUNCTION 'GET_INCLUDETAB'
    EXPORTING
      progname = so_prog-low " '/FDSEU/CP01_OVERVIEW' 'SAPMM07M'
    TABLES
      incltab  = lt_includes.

  MOVE p_regex TO lv_regex.

* Add body of main program into itab to be explored
  APPEND so_prog-low TO lt_includes.

  LOOP AT lt_includes INTO ls_includes .

    REFRESH lt_lines.
    READ REPORT ls_includes-prgname INTO lt_lines.

    CLEAR lv_line_num.

    LOOP AT lt_lines INTO ls_line.

      lv_line_num = lv_line_num + 1.

      TRY.
          " Does string match the regex pattern?
          FREE lc_matcher.
          lc_matcher = cl_abap_matcher=>create(
                      pattern     = lv_regex
                      ignore_case = abap_true
                      text        = ls_line ).
          lv_match = lc_matcher->match( ).
        CATCH cx_root INTO oref.
          lv_exception_detail = oref->get_text( ).
          WRITE: / 'Regex error occurred : ', lv_exception_detail.
          STOP.
      ENDTRY.
      IF lv_match EQ abap_true.
        WRITE: / 'INCLUDE: ', ls_includes-prgname, '| LINE: ', lv_line_num, '| CODE: ', ls_line.
        CLEAR lv_match.
      ENDIF.
    ENDLOOP. " lines

    lv_lines_total_num = lv_lines_total_num + lv_line_num.

  ENDLOOP. " includes

  WRITE: /, 'Total number of explored lines: ', lv_lines_total_num.



Regular expressions sources

Regular expressions are a common technique used in many programming languages, and ABAP supports them as well.

Basic understanding
http://zevolving.com/2013/10/abap-regex-regular-expressions/

SAP/SDN comprehensive PDF document
http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/902ce392-dfce-2d10-4ba9-b4f777843182?overridelayout=true&49533857875589

https://help.sap.com/abapdocu_70/en/ABENREGEX_SYNTAX.htm

Regex toy – Regular expressions tester 

You can test your own patterns in the standard SAP program named “DEMO_REGEX_TOY”.

Sources: http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/1f9cd701-0b01-0010-87b8-f86f9b7b823f?QuickLink=index&overridelayout=true&5003637741589

For example, to pick up all lines that do not contain the words “cute” or “fast”, regardless of any apostrophes:
Regex: ^((?!cute|fast).)*$




Other sources: