Sunday, October 18, 2015

Software Quality Award comment for Checkstyle

ATTENTION: none of the following should be treated as an appeal or a demand to reconsider the review of the Checkstyle project in the Award. This post is aimed only at contributors who are asking why Checkstyle missed such obvious points.

http://www.yegor256.com/award.html

> 83K Java LoC, 553K HoC

suseika/inflectible       (5K LoC, 36K HoC) - winner
testinfected/molecule     (10K LoC, 43K HoC)
coala-analyzer/coala      (14K LoC, 160K HoC)
xvik/guice-persist-orient (17K LoC, 54K HoC)
raphw/byte-buddy          (84K LoC, 503K HoC)
citiususc/hipster         (5K LoC, 64K HoC)

checkstyle/checkstyle     (83K LoC, 553K HoC)
kaitoy/pcap4j             (42K LoC, 122K HoC)




We are the second-biggest project in the final by lines of code. We are also the oldest project, almost 15 years old, which explains some of the design problems below.

> There are many ER-ending classes, like SeverityLevelCounter, Filter, and AbstractLoader (for example), which are anti-patterns.

That is a controversial anti-pattern, and I will explain in a separate post why it does no damage. No further comment on this point; it is just the philosophy of the Award owner.

> There is a whole bunch of utility classes, which are definitely a bad thing in OOP. They are even groupped into a special utils package, such a terrible idea.

Utility classes are good because they are stateless implementations of algorithms. This is a mismatch of design philosophy between the Award author and us.

> Setters and getters are everywhere, together with immutable classes, which really are not an OOP thing, for example DetectorOptions.

Yes, that is the effect of a 15-year-old project and of the numerous Checkstyle integrations and extensions that already exist in the world. We cannot change this: it would do huge damage to compatibility, and nobody would benefit from such an update. But we are already working on making Checkstyle more immutable.
Attention: getter/setter is a very controversial anti-pattern (mutability is a concern, but the naming is not a problem); I will explain this in a separate post.

> NULL is actively used, in many places — it's a serious anti-pattern

Yes, that problem comes from the core of Checkstyle, the ANTLR2 parser.

> I've found five .java files with over 1000 lines in each of them, for example 2500+ in ParseTreeBuilder.java

Yes, and we do this for good reason. First of all, https://github.com/checkstyle/checkstyle/blob/master/src/test/java/com/puppycrawl/tools/checkstyle/grammars/javadoc/ParseTreeBuilder.java is in the test area; secondly, it is generated :).
There are exceptions to the rule for big files:
2) UTs are also excluded, as it is easier to go from class MyCheck.java --> MyCheckTest.java than to puzzle over what MyCheckSecondTest.java means (and how the author decided when a test should go into a second test file). So it is OK.

> There are direct commits to master made by different contributors and some of them are not linked back to any tickets. It's impossible to understand why they were made. Look at this for example: 7c50922. Was there a discussion involved? Who made a decision? Not clear at all.

That commit is mine :). But we do have strict control of commit messages; we are even more fanatical about it than most other projects. The whole idea is described at https://github.com/checkstyle/checkstyle/wiki/Release-notes-automation . There is no problem with direct commits, especially from the most experienced owners of the project (see all the details in the wiki link).

> Releases are not documented at all.

This is the first time I have seen somebody pay attention to https://github.com/checkstyle/checkstyle/releases . We do have Release Notes in a human-friendly form on our main HTML site - http://checkstyle.sourceforge.net/releasenotes.html (it is not a list of commits!!! Users do not need commits!!!)

> Release procedure is not automated. At least I didn't find any release script in the repository.

It is automated and done by the standard Maven procedure "mvn release"; there is no need for a special shell file, and we are a multi-OS project (Linux, Windows, MacOS).
A release is not just copying binaries to an artifact repository! Here are detailed instructions on how to make a release - https://github.com/checkstyle/checkstyle/wiki/How-to-make-a-release - with all the details of how the systems should be prepared and what should be updated after the version bump. But I agree, I would be happy to automate it further. For now it does not make sense to spend time on difficult automation if we release once a month.


Summary:

Some quotes from the Award page:
> I'm a big fan of object-oriented programming in its purest form
> ....
> Strict and visible principles of design.

We participated in a pure-OOP contest. I do not share the fanatical devotion to pure OOP design; I am more in favor of changes that make the code more functional. Any refactoring in favor of OOP is impossible without breaking compatibility with plugins/extensions.
So 7th place is very good.

Friday, October 16, 2015

Code coverage could help to detect dead or extra code

Code coverage can help detect dead code or useless conditions; see the example at
https://github.com/sevntu-checkstyle/sevntu.checkstyle/pull/324

If there is no way to reach a piece of code from a UT, that code is dead or useless.

Interesting automation for dead code removal based on code coverage (Guantanamo project) - http://docs.codehaus.org/display/ASH/Guantanamo

How to cache maven local repo between Travis builds

Caching is allowed only for private repos, or for public repos that use the container-based infrastructure.

Steps:
1. Instruct Travis to use the container-based infrastructure.
2. Instruct Travis which folder you need to cache.
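
The two steps above might look like this in .travis.yml. This is a minimal sketch, assuming the Travis settings of the time (where "sudo: false" routes a build to the container-based infrastructure) and Maven's default local repository location, $HOME/.m2. The jdk entry is an illustrative assumption, not taken from the Checkstyle configuration.

```yaml
language: java

# Step 1: request the container-based infrastructure
sudo: false

# Illustrative JDK choice (assumption)
jdk:
  - oraclejdk8

# Step 2: keep the Maven local repository between builds
cache:
  directories:
    - $HOME/.m2
```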

Official documentation:
http://docs.travis-ci.com/user/workers/container-based-infrastructure/#Routing-your-build-to-container-based-infrastructure
http://docs.travis-ci.com/user/caching/#Arbitrary-directories


Result: caching speeds up the build by ~3-4 minutes for each JDK.

Example: https://github.com/checkstyle/checkstyle/blob/master/.travis.yml

Oracle jobs creation and logging activation

Create a job, configure logging and enable it:

BEGIN
DBMS_SCHEDULER.CREATE_JOB (
   job_name           =>  'TEST_SCHEDULED_JOB',
   job_type           =>  'PLSQL_BLOCK',
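   -- a SELECT inside a PL/SQL block needs an INTO clause, so the result goes into a local variable
   job_action         =>  'DECLARE n NUMBER; BEGIN SELECT COUNT(*) INTO n FROM myschema.mytable; END;',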
   start_date         =>  TO_DATE('14/01/2015,12:00 AM', 'DD/MM/YYYY,HH:MI AM'),
   repeat_interval    =>  'FREQ=HOURLY; INTERVAL=4',
   end_date           =>  null,
   job_class          =>  'DEFAULT_JOB_CLASS',
   comments           =>  'detection and notification of ORA- errors in log files.  Notify admins');
END;
/

BEGIN
DBMS_SCHEDULER.SET_ATTRIBUTE ('TEST_SCHEDULED_JOB', 'logging_level', DBMS_SCHEDULER.LOGGING_FULL);
END;
/

BEGIN
DBMS_SCHEDULER.ENABLE('TEST_SCHEDULED_JOB');
END;
/

Check that the job is scheduled correctly:

select last_start_date, next_run_date
from DBA_SCHEDULER_JOBS
where job_name = 'TEST_SCHEDULED_JOB';


After testing, drop the job:

begin
 DBMS_SCHEDULER.drop_job (job_name => 'TEST_SCHEDULED_JOB');
end;
/

Filter logback events by time


Example:

<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
  <target>System.out</target>
  <encoder>
    <pattern>${common.log.conversionpattern}</pattern>
  </encoder>
  <!-- perform more precise filtering, you can suppress particular messages here -->
  <filter class="ch.qos.logback.core.filter.EvaluatorFilter">
    <evaluator>
      <expression><![CDATA[
        java.util.Calendar cal = java.util.Calendar.getInstance(java.util.TimeZone.getDefault());
        cal.setTimeInMillis(timeStamp);
        int hour = cal.get(java.util.Calendar.HOUR_OF_DAY); // HOUR_OF_DAY is required for 24h format

        // events logged at 15:00 or later do not match and are denied
        if (hour > 14) {
          return false;
        }

        return true;
      ]]></expression>
    </evaluator>
    <OnMatch>NEUTRAL</OnMatch> <!-- we may want to add more filters later -->
    <OnMismatch>DENY</OnMismatch>
  </filter>
</appender>

Oracle SQL to show all columns for all tables in particular schema

Oracle SQL to show all columns for all tables in a particular schema, one row per table:

SELECT   table_name, SUBSTR (MAX (all_columns), 2) all_columns
       FROM (SELECT     table_name,
                        SYS_CONNECT_BY_PATH (column_name, ',') all_columns
                   FROM (SELECT table_name, column_name,
                                ROW_NUMBER () OVER (PARTITION BY table_name ORDER BY column_id)
                                                                    column_no
                           FROM all_tab_columns c
                          WHERE c.owner = 'YOUR_SCHEMA'
                            --AND column_name NOT IN ('OLD_', 'DEP_')
                            )
             CONNECT BY PRIOR table_name = table_name
                    AND PRIOR column_no = column_no - 1
             START WITH column_no = 1)
   GROUP BY table_name;


The output looks like:
TABLE_NAME | ALL_COLUMNS
-------------------------------------------------
FLAGS      | ID,VALUE,LASTUPD
MESSAGES   | DB_TIMESTAMP,PRIORITY,MESSAGE,DATASETS,IDS,CODE

Salted Password Hashing

Custom PMD Rule creation

Based on PMD 5.4; the following XML can simply be copied into the PMD configuration file.

This rule is an extension of ShortVariable, updated to skip validation of methods with the Override annotation.

Rule:
<rule name="CustomShortVariable"
      message="Avoid variables with names shorter than 2 symbols: {0}"
      language="java"
      class="net.sourceforge.pmd.lang.rule.XPathRule"
      externalInfoUrl="">
  <description>
    Fields, local variables, or parameter names that are very short are not helpful to the reader.
  </description>
  <priority>3</priority>
  <properties>
    <property name="xpath">
      <value>
<![CDATA[
//VariableDeclaratorId[string-length(@Image) < 2]
[not(ancestor::ForInit)]
[not(../../VariableDeclarator and ../../../LocalVariableDeclaration and ../../../../ForStatement)]
[not((ancestor::FormalParameter) and (ancestor::TryStatement))]
[not(ancestor::ClassOrInterfaceDeclaration[//MarkerAnnotation/Name[pmd-java:typeof(@Image, 'java.lang.Override', 'Override')]])]
]]>
      </value>
    </property>
  </properties>
</rule>

Friday, October 2, 2015

Execution of command right after process is finished in bash

Task: you have a Java process that runs for a long time, and you need to execute something right after it finishes.

First of all, get the details of the running job:

[user@myserver ~]$ jps -vm
32632 Jps -vm -Dapplication.home=/opt/jdk1.8.0_45 -Xms8m
16364 Shell --home /var/jenkins/jobs/validator --processor com.mycompany.Validator -Xmx1024m 

[user@myserver ~]$ jps -vm | grep Validator
16364 Shell --home /var/jenkins/jobs/validator --processor com.mycompany.Validator -Xmx1024m 

Now you want to get the time when it finished:
[user@myserver ~]$ while [[ $(jps -vm | grep -c Validator) != 0 ]]; do sleep 60; done; echo "finished at $(date)"

Instead of echo "finished at $(date)" you can put any other command; a 60-second polling interval was OK for my task.

Instead of "jps -vm" you can use "ps -ax" or whatever you like to grab the process id or process details.
On SunOS use "ps -Af".

FYI, to do something and then send an email:
JOB="my-job" && while [[ $(jps -vm | grep -c "$JOB") != 0 ]]; do sleep 60; done; echo "echo smth valuable" ; echo "" | mailx -s "mailx: $JOB is finished and do_smth_valuable is done" myemail@mycompany.com,myemail2@mycompany.com
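
The same polling idea can be wrapped in a small reusable function. The sketch below is only an illustration: the name run_after_pid is hypothetical, and it polls by process id (the first column of the jps output) instead of grepping the process list, which assumes the process belongs to the current user.

```shell
# Hypothetical helper: poll until the process with the given PID has exited,
# then run the remaining arguments as a command.
# Usage: run_after_pid PID INTERVAL_SECONDS COMMAND [ARGS...]
run_after_pid() {
  local pid="$1" interval="$2"
  shift 2
  # "kill -0" sends no signal; it only checks whether the PID can be signalled.
  # Assumption: the process is owned by the current user, otherwise the check
  # fails immediately and the command runs right away.
  while kill -0 "$pid" 2>/dev/null; do
    sleep "$interval"
  done
  "$@"
}
```

For the Validator process shown earlier this could be called as run_after_pid 16364 60 date, with any other command in place of date.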


Thursday, October 1, 2015

Checkstyle integration into the codeclimate.com platform

from Michael Bernstein:

We opened up the platform and invited developers to make "engines" that run on it - basically small wrappers around existing static analysis tools
checkstyle was one of the first that someone contributed
It's a small groovy script, basically - you can see it here https://github.com/sivakumar-kailasam/codeclimate-checkstyle
We think that this is a really awesome thing for people who work on tools like you do
Because it's a new way for people to use it - we run analysis for free for all OSS repos
cool, let me see
Basically you add your repo to codeclimate
And add a .codeclimate.yml to your repo
Where can I read more about your platform that allows running custom analysers?
Overview of the platform is here: codeclimate.com/platform
Blog post about how to build engines is here: blog.codeclimate.com/blog/2015/07/07/build-your-own-codeclimate-engine/
Please share examples of .codeclimate.yml
Let me know if you get a chance to check it out