Tuesday, February 18, 2014

What I do not like about Scala

Here I focus only on the problems of Scala. Scala has a number of benefits too, but those will be covered in another post. I look at Scala from the perspective of Java experience.

Problems:

- Strict in types but loose in syntax (the meaning of the syntax is unpredictable).

- Not every recursive function can be transformed by tail-call optimization; see the blog of Rich Dougherty.

- List append: you need to remember to keep the first list smaller than the second to avoid performance problems.

- Scala is like a math formula: great when everything is logical and everybody understands math easily, but real life is far from math.

- The syntax is sometimes unreadable; it is like Perl.

- The meaning of the syntax depends on the linked libraries.

- Implicit conversions are hard to follow when a human investigates the code, so the IDE has to be smart. Example: String to StringOps.

- What is the reason to use ++ on maps instead of addAll?

- Switching from val to var can make an immutable set have operators like +=.

- Great!!! Paul Phillips is a co-founder of Typesafe and the most prolific committer to Scala. His talk is about Scala problems: http://www.slideshare.net/extempore/a-scala-corrections-library



to be continued ....

Monday, February 17, 2014

Usage of the hashCode and equals methods in Java


Known practice: either override neither equals(Object obj) nor hashCode(), or override both of them together.
But even after overriding both and implementing them properly, or even generating the code for them with a smart tool, there can still be a problem.

Advice: all classes that override hashCode() and are going to be used in hashed collections have to be immutable.

If you do not follow the advice above, you can end up in this situation: you put an object in a HashSet, change one of the object's fields, and then try to find it in the collection with contains(). You will not find it, because the hash code has changed: the object still sits in its old bucket, but the newly calculated hash code points to another bucket (wiki visualization).
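
The situation above can be reproduced with a minimal sketch. The class names LostInHashSetDemo and MutableKey and the field id are made up for illustration; they do not come from any real code base.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class LostInHashSetDemo {

    // Hypothetical mutable class: the field takes part in equals() and hashCode().
    static final class MutableKey {
        int id;

        MutableKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof MutableKey && ((MutableKey) obj).id == id;
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    /** Adds an object to a HashSet, mutates it, and reports whether contains() still finds it. */
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(1);
        set.add(key);
        key.id = 2; // the hash code changes, but the set still files the object under the old one
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
    }
}
```

With these values the lookup fails even though the very same object is still inside the set. Declaring the field final (making the class immutable) removes the problem.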

Advice: hashCode() is a Java system-related method; do not override it unless you have no other choice!!!

Advice: if hashCode() is demanded after equals(Object obj) has already been overridden, change the code to use Comparable or some other approach. It will not cost any additional coding time, but it will save you from problems in the future when your system is huge.
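
One possible reading of this advice, as a sketch only: keep the objects in a sorted collection such as TreeSet, which decides ordering and duplicates through compareTo() and never calls hashCode(). The class Version and its fields are hypothetical. The fields are final on purpose, because a sorted collection gets confused in the same way if a compared field changes after insertion.

```java
import java.util.Set;
import java.util.TreeSet;

class ComparableInsteadOfHashDemo {

    // Hypothetical immutable value class: identity is defined by compareTo() only.
    static final class Version implements Comparable<Version> {
        private final int major;
        private final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        @Override
        public int compareTo(Version other) {
            int byMajor = Integer.compare(major, other.major);
            return byMajor != 0 ? byMajor : Integer.compare(minor, other.minor);
        }
    }

    /** Adds three versions, two of which compare as equal, and returns the set size. */
    static int distinctCount() {
        Set<Version> versions = new TreeSet<>();
        versions.add(new Version(1, 0));
        versions.add(new Version(1, 0)); // compareTo() returns 0, so this one is not added
        versions.add(new Version(1, 2));
        return versions.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct versions: " + distinctCount());
    }
}
```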

References to other tools:
FindBugs:
HE: Class defines equals() but not hashCode()
HE: Class defines equals() and uses Object.hashCode()
HE: Class defines hashCode() but not equals()
HE: Class defines hashCode() and uses Object.equals()
HE: Class inherits equals() and uses Object.hashCode()

PMD:
http://pmd.sourceforge.net/pmd-4.3.0/rules/basic.html#OverrideBothEqualsAndHashcode

Checkstyle:
http://checkstyle.sourceforge.net/config_coding.html#EqualsHashCode

Summary:
All tools check that the methods exist, but none of them checks for immutable state.
Is that on purpose, or is it an uncovered problem of static code analyzers for Java?

How to quickly revert the last pushed commit

I found that granting a permission for a short while in gitolite forces me to commit and push a change and then revert it a few minutes later. Here is how to quickly revert the last pushed commit (its hash is in the push output). Note that git revert creates a new commit, which has to be pushed as well:

02:45 PM ~/java/git/gitolite-admin/conf [master|● 1] $ git commit -m "fix for hotfix branch"
[master 9d8f0ae] fix for hotfix branch
 1 file changed, 1 insertion(+), 1 deletion(-)
02:45 PM ~/java/git/gitolite-admin/conf [master ↑·1|✔] $ git push
Counting objects: 10, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 385 bytes | 0 bytes/s, done.
Total 4 (delta 1), reused 0 (delta 0)
To git@git.mycompany.com:gitolite-admin
   040e849..9d8f0ae  master -> master
02:46 PM ~/java/git/gitolite-admin/conf [master|✔] $ git revert 9d8f0ae

You can verify that the hash is the same by reviewing the log before the revert:
02:45 PM ~/java/git/gitolite-admin/conf [master|✔] $ git log
commit 9d8f0ae3a56152067341c4f2c301c4d232dcbda0

Load all JSON files from a folder into MongoDB

Here is one way to quickly load a number of JSON files into a DB (MongoDB) for quick data mining:

#!/usr/bin/env bash

# Import every JSON file in the current folder into the "news" collection.
for file in *.json; do
   ./mongoimport -h mongodb.mycompany.com --db mytests --collection news --file "$file" --jsonArray
done


Query to investigate the data ("news" is the collection, 'Headline' is a field in the JSON):
db.news.find({Headline: {$regex: " for period end .*"}},  {Headline:1, _id:0})

Links:
http://docs.mongodb.org/manual/core/import-export/

How to remove trailing zeros from CSV files


I had a task to extract data from an Oracle DB into CSV files and then remove all trailing zeros in the last column.

Trim trailing zeros in all files:
for file in "$OUT_PATH"/*.csv; do
   sed -i 's/[ ]*$//;s/\.00$//;s/\.0$//;s/\(\.[0-9]\)0$/\1/' "$file"
done
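
To see what the four sed substitutions do to a single line, here is a small Java port of the same expressions, applied in the same order. It is only a sketch for experimenting: the class name TrailingZeros, the method trimTrailingZeros and the sample rows are made up for illustration.

```java
class TrailingZeros {

    /** Applies the same four substitutions as the sed command, in the same order. */
    static String trimTrailingZeros(String line) {
        return line
                .replaceFirst("[ ]*$", "")            // s/[ ]*$//            : drop trailing spaces
                .replaceFirst("\\.00$", "")           // s/\.00$//            : 12.00 -> 12
                .replaceFirst("\\.0$", "")            // s/\.0$//             : 12.0  -> 12
                .replaceFirst("(\\.[0-9])0$", "$1");  // s/\(\.[0-9]\)0$/\1/  : 12.50 -> 12.5
    }

    public static void main(String[] args) {
        // Sample rows are made up for illustration.
        String[] samples = {"a,12.00", "b,12.50", "c,12.25", "d,7.30  "};
        for (String sample : samples) {
            System.out.println(trimTrailingZeros(sample));
        }
    }
}
```

Because every expression is anchored with $, only the end of each line (the last column) is touched.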


Script that does the extraction to a file with sqlplus:

Bash script:
#!/usr/bin/env bash

CONNECT="user/password@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=SERVER)(Port=1521))(CONNECT_DATA=(SID=my_sid)))"

sqlplus "$CONNECT" @extract.sql result.csv


SQL file (extract.sql):

set echo off
set feedback off
set verify off
set pagesize 0
set head off
spool '&&1'
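select /*csv*/ name||','||TO_CHAR(value,'fm999G999G990D00') from my_table;
-- close the spool file and leave SQL*Plus so that the calling bash script can continue
spool off
exit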

Explanation of why Oracle cannot do human-readable formatting for decimal numbers:
Require a Number Format Mask to show leading zeros on decimals
Number format (trailing zeros)

Saturday, February 15, 2014

New Check: Grammar in names and negative names restrictions

One more ambition for Checkstyle: check grammar in method names and restrict negative names.

Reason:
- developers make typos
- not all developers are good at English, and this is a nightmare for other team members who are native English speakers
- force developers to avoid negative names such as "isIncorrect()", "isNotIgnoredClass()"

Why is a negative name a bad approach? Because it forces the use of that negative logic in other places.

Example:

Negative logic/name:
public boolean isNotCorrect() {...}

forces others to write code like:
if (isNotCorrect())  // OK
if (!isNotCorrect())  // is NOT OK - it is unreadable!!!

So let's remove the negative logic/name:
public boolean isCorrect() {...}

forces others to write code like:
if (!isCorrect())  // it is OK
if (isCorrect())  // it is OK

BUT life is not that ideal: sometimes it is convenient to use negative logic, or it is even dictated by a dependency on third-party or legacy code.

This idea depends on a grammar check of names based on a vocabulary: http://grammarist.com/usage/negative-prefixes/

We need options to check "boolean methods", "setters/getters", and so on. Users might not need to check all names, so there should be an option to check based on visibility (public/private/...).
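
A rough sketch of the vocabulary-based part of such a check, in plain Java. It does not use the Checkstyle API; the class NegativeNameCheckSketch, the method hasNegativeWord and the tiny word list are all hypothetical, and a real check would load a full vocabulary and honor the options mentioned above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class NegativeNameCheckSketch {

    // Hypothetical, deliberately tiny vocabulary of negative words.
    private static final Set<String> NEGATIVE_WORDS = new HashSet<>(
            Arrays.asList("not", "no", "non", "incorrect", "invalid", "unknown"));

    /** Returns true if any camelCase word of the method name is in the negative vocabulary. */
    static boolean hasNegativeWord(String methodName) {
        // Split before each upper-case letter: "isNotIgnoredClass" -> is, Not, Ignored, Class
        for (String word : methodName.split("(?=[A-Z])")) {
            if (NEGATIVE_WORDS.contains(word.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] names = {"isNotIgnoredClass", "isIncorrect", "isCorrect", "isInitialized"};
        for (String name : names) {
            System.out.println(name + " -> " + hasNegativeWord(name));
        }
    }
}
```

Matching whole words against a vocabulary, rather than matching prefixes such as "In" or "Un", avoids flagging names like isInitialized.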


http://www.liferay.com/community/wiki/-/wiki/Main/Javadoc+Guidelines#section-Javadoc+Guidelines-General+Guidelines , search for "Refer to parameters with "the", not "a" or "given""; it would be a good validation.

Class: Initial and detailed description # "The first sentence of the initial class description starts with a verb (two verbs, in fact)."




If you come across this page and have an idea to share, please let me know.

Friday, February 7, 2014

Different file size on server and local file system

I got a case where a file on the server (Solaris) has a different size from what I have on my local PC (Ubuntu 12.04):

[user@server ~]$ du -h /var/tmp/file.csv
548K    /var/tmp/file.csv
[roman@laptop ~]$ du --apparent-size -h /var/tmp/file.csv
1.9M    /var/tmp/file.csv

[user@server ~]$ ls -lh /var/tmp/file.csv
-rw-r--r-- 1 sb sb 1.9M Feb  7 13:59 /var/tmp/file.csv
[user@server ~]$ du -h /var/tmp/file.csv
548K    /var/tmp/file.csv
[user@server ~]$ du --apparent-size -h /var/tmp/file.csv
1.9M    /var/tmp/file.csv

from man:
    --apparent-size
              print apparent sizes,  rather  than  disk  usage;  although  the
              apparent  size is usually smaller, it may be larger due to holes
              in (`sparse') files, internal  fragmentation,  indirect  blocks,
              and the like