Wednesday, January 29, 2014

ETL Work - Talend

Extract, Transform, and Load (ETL) is a general concept of how to move data from one system to another.  There are a lot of tools and a lot of different methodologies to accomplish this.  But I want to touch upon a product and ecosystem that I had never heard of until 4 months ago.

Talend is an open source, Eclipse-based project that allows you to design ETL work visually.  It stores its configuration in XML files, and when you are ready to execute, for debugging or production, it generates Java code and compiles it.

My group has bought into the entire Talend ecosystem, which combines many different open source projects into a nicely integrated product.

But I'm a ColdFusion lover, and this blog is about ColdFusion, so here are the two integration points that Talend provides.

First, if you buy into the ecosystem, you get an administration server that schedules your jobs, and you can use modules that consume web services, and we all know how awesome ColdFusion is at building web services!

Second, Talend Studio, the Eclipse development UI, is free.  Using this free product gives you the ability to write ETL jobs in Java, compile them, and then execute them from ColdFusion.

Both scenarios allow you to be a novice Java developer and still use the powerful, more complicated parts of Java.  For example, it is a single click to turn on parallelism for the entire job, so long-running jobs finish in a fraction of the time.

Lastly, our use case for this tool is transitioning from one system to another.  Because of the scope of the project, it won't be one quick cutover.  Therefore, having the ability to run the ETL job over and over and over (you get it) while developing, testing, and then in production during the transition phase is very powerful.  Having the entire Java core available to transform the data is also a great resource to have under your belt.

Monday, November 18, 2013

Switch from SVN to GIT

I've been using github.com and bitbucket.com for a while now, but less than a year ago, the development group I'm with decided to switch to GIT.  I was nervous because most of the work I did on github and bitbucket was solo.  I had never really been on the other side, managing and supporting other developers with GIT.

But ignoring my nerves, I started to really review the best (or common) practices of GIT and its distributed nature.  The distributed nature still scares me a little, because one developer could really F'up the central repository (depending on your setup).  Atlassian's Stash was chosen as the GIT managing server, because it adds features to GIT that you would normally have to do manually.  I'll jump into Atlassian's Stash a little later.  For now, just the SVN to GIT conversion.

There are a lot of tutorials on how to migrate your SVN repository into GIT.  My article isn't meant to guide you through that, but rather through the common developer gotchas, because you have to think slightly differently when using GIT versus SVN.

The big gotcha that keeps getting me is that it is distributed.  A commit is only local; no other developer will see it until it is "pushed" to a central or other GIT repository.  The next gotcha is the language that comes with GIT.  Commit, push, pull, and fetch are common GIT terminology; make sure you read the definitions and understand them.

The last gotcha that I keep having also comes from the distributed model.  You need to remember to keep your local repository updated from the other "central" repository.
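
The commit, push, and fetch distinction can be tried out safely in a throwaway directory.  The following is a minimal sketch, assuming git is installed; the names central.git, alice, and bob are made up for illustration and stand in for the shared server and two developers' local clones.

```shell
#!/bin/sh
# Sketch: a commit stays local until it is pushed, and another clone
# only learns about it after fetching. All names here are hypothetical.
set -e

work=$(mktemp -d)

# A bare repository plays the role of the "central" server.
git init --quiet --bare "$work/central.git"

# Two developers clone it (cloning an empty repository prints a warning).
git clone --quiet "$work/central.git" "$work/alice" 2>/dev/null
git clone --quiet "$work/central.git" "$work/bob" 2>/dev/null

# Alice commits. This is recorded only in her local repository.
echo "hello" > "$work/alice/readme.txt"
git -C "$work/alice" add readme.txt
git -C "$work/alice" -c user.name="Alice" -c user.email="alice@example.com" \
    commit --quiet -m "Add readme"
before_push=$(git -C "$work/central.git" rev-list --all --count)

# Pushing publishes the commit to the central repository.
git -C "$work/alice" push --quiet origin HEAD
after_push=$(git -C "$work/central.git" rev-list --all --count)

# Bob's clone knows nothing about it until he fetches.
before_fetch=$(git -C "$work/bob" rev-list --all --count)
git -C "$work/bob" fetch --quiet origin
after_fetch=$(git -C "$work/bob" rev-list --all --count)

echo "central: $before_push commit(s) before push, $after_push after"
echo "bob:     $before_fetch commit(s) before fetch, $after_fetch after"

rm -rf "$work"
```

Note that fetch only downloads the new commit into Bob's repository; his working files change once he merges it, which is what pull does in one step.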

The last thing I would like to point out is that having a defined "workflow" for your GIT repositories is key.  Because GIT is distributed, you can organize the same work in many different ways.  There is NOT a one-size-fits-all approach here, but there is a solution for everyone; just look around.  Here are some resources for GIT workflows:

GIT is flexible and can be adjusted as you go.  Put a vision in place for the workflow and implement that vision in small steps.  In my opinion, the transition from SVN to GIT using GIT workflows will greatly impact your development practices.

Tuesday, December 18, 2012

How my world has changed (personally)

In April 2012 doctors found a brain tumor in my 7-year-old son, and within 3 days they removed it.  I try to keep my professional career and personal life separate in many ways.  But I believe something this big should cross those boundaries.  This will be my only post about it on my technical blog, so feel free to follow my personal blog if you would like to stay in touch.

As of today, I feel very lucky with how events have turned out.  Although the trip my family is on hasn't ended, our situation could have been a lot different.

I've been trying to compile the events in my personal blog (www.lonestarbandit.com).


Tuesday, December 4, 2012

Railo Frameworks within Subfolders - file not found

I use ColdBox, and I am just now looking at using Railo with Tomcat on Linux.  I have a development machine where I am loading different test cases into a single web root.  For example:

/contentBoxTest/
/TestIntranetSite/
/TestPublicSite/

The main handler worked, but all subsequent handlers failed, including modules.

I found out that Railo/Tomcat doesn't have the greatest wildcard processor for handling frameworks with SES URLs, for example /contentBoxTest/index.cfm/help/me

So I found that for each subfolder that we want the index.cfm to be processed in a SES way, we have to add a line to the configuration file.

On my machine, a default install, the config file is located at

/opt/railo/tomcat/conf/web.xml

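There is a <servlet-mapping> structure within the XML file with <servlet-name>GlobalCFMLServlet</servlet-name> within it.

Mine now looks like this:

    <servlet-mapping>
        <servlet-name>GlobalCFMLServlet</servlet-name>
        <url-pattern>*.cfm</url-pattern>
        <url-pattern>*.cfml</url-pattern>
        <url-pattern>*.cfc</url-pattern>

        <url-pattern>/index.cfm/*</url-pattern>
        <url-pattern>/default.cfm/*</url-pattern>
        <url-pattern>/post.cfm/*</url-pattern>
        <url-pattern>/archive.cfm/*</url-pattern>
        <url-pattern>/blog.cfm/*</url-pattern>
        <url-pattern>/page.cfm/*</url-pattern>
        <url-pattern>/rewrite.cfm/*</url-pattern>

        <url-pattern>/TestPublicSite/index.cfm/*</url-pattern>
        <url-pattern>/TestIntranetSite/index.cfm/*</url-pattern>
        <url-pattern>/contentBoxTest/index.cfm/*</url-pattern>
    </servlet-mapping>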
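
Adding one line per subfolder gets repetitive once a web root holds many test sites.  As a small convenience, and assuming the mappings are written as <url-pattern> entries as in a standard servlet web.xml, a shell function can print the lines to paste into the servlet mapping.  The function name ses_patterns is made up for this sketch.

```shell
#!/bin/sh
# Sketch: print one SES <url-pattern> line per application subfolder.
# ses_patterns is a hypothetical helper, not part of Railo or Tomcat.
ses_patterns() {
    for dir in "$@"; do
        printf '<url-pattern>/%s/index.cfm/*</url-pattern>\n' "$dir"
    done
}

# Generate the lines for the three test folders used above.
ses_patterns TestPublicSite TestIntranetSite contentBoxTest
```

The output still has to be pasted into web.xml by hand, and Railo/Tomcat will likely need a restart before the new mappings are picked up.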
   



Thursday, May 24, 2012

Amazon AWS Railo Setup - A Trial

After looking at ColdFusion hosting providers and the extreme costs and limitations they throw at you, I've decided to look at running my own in the cloud.  Amazon AWS's set of features and flexible price structure, including a free first year for development, made it an easy decision to investigate first.

I've got to say, at first I was a little overwhelmed by the naming conventions of AWS products.  But after some stumbling and reading a few articles, the terminology actually fits.  Once I got a machine running (Ubuntu 32-bit), which was really easy with the "launcher" wizard that AWS (EC2) provides, the challenge of connecting became very obvious.  I have used SSH in the past, but always with a password.  AWS by default does NOT allow this; key pairs are required.  Again, after a couple of quick searches, I found PuTTY was the easiest.  I wish I had the main page I used, but I can't find it in my browser history.  Anyway, you take the PEM file given to you by Amazon (you can only download it once, so make sure you back it up) and have PuTTY convert it to a PuTTY key file.  Once that's done, use PuTTY to connect to the host name with the key pair file.

Now once you are connected, there is no UI, so Linux command knowledge is required.  You don't have to be an expert, but knowing the basics helps!

I used the blog post http://blog.nictunney.com/2012/03/railo-tomcat-and-apache-on-amazon-ec2.html to install and configure Railo, Tomcat, and Apache2.  The only commands he doesn't have in his blog are the change-ownership command

chown -R railo /var/www

and the restarting of the services apache2 and railo

sudo service apache2 restart
sudo /opt/railo/railo_ctl restart

I now have Railo set up on this server.  My next steps will be setting up MySQL and getting ColdBox and ContentBox running on this server.

Thursday, November 10, 2011

Creating an environment safe wrapper for cfmail

The one thing that always makes me nervous when testing other people's code is: who is this going to email? Well, many years back I created a cf_email custom tag that wrapped cfmail with some special if logic before it to determine whether it was running in production or a different environment. The logic is easy

comparenoCase(server.serverLevel,"PRODUCTION") EQ 0

and there was another script, "server.cfm", included in the application.cfm file that loaded the server variables.
But the issue I ran into lately was that I needed to use one of the newer features within cfmail that wasn't hard coded into my wrapper. So, knowing that ColdFusion added support for attributeCollection=attributes and seeing that Ray Camden had a nice blog post on CFASSOCIATE, I asked one of my co-workers to work on it :-D Anyway, we came up with the code below.

The code below allows the same cfmail features in ColdFusion to be used within the custom tag, but now we can add additional logic to verify the environment, send the program's email only in production, and tweak the TO, CC, and BCC fields when not in production.
Test Code
<cf_email to="endUser@cfhero.com" from="help@cfhero.com" subject="This is only a test">

    <cf_emailparam file="#expandPath("./web.jpg")#" disposition="inline" contentID="image1">
 
    There should be an image here <img src="cid:image1" />

</cf_email>
cf_email
<cfif CompareNoCase(thisTag.hasEndtag,"NO") EQ 0>
 <cfthrow message="cf_email requires end tag" />
</cfif>

<cfparam name="attributes.from">
<cfparam name="attributes.to">
<cfparam name="attributes.subject">


<cfif CompareNoCase(thisTag.executionMode,"END") EQ 0 >

<!--- add customization here for non-production use --->
 
<!--- blank the generated content out and move it into a variable --->
<cfset variables.GeneratedContent = thisTag.GeneratedContent>
<cfset thisTag.generatedContent = "">
 
<cfmail attributecollection="#attributes#" from="#attributes.from#" subject="#attributes.subject#" to="#attributes.to#">
 <cfif isDefined("thisTag.emailParam")>
  <cfloop array="#thisTag.emailParam#" index="attrCol">
   <cfmailparam attributecollection="#attrCol#" >
  </cfloop> 
 </cfif>
#variables.GeneratedContent#
</cfmail>
</cfif>


cf_emailParam
<cfif CompareNoCase(thisTag.hasEndtag,"YES") EQ 0 AND CompareNoCase(thisTag.executionMode,"END") EQ 0 >
 <cfexit method="exittag" />
</cfif>

<cfassociate basetag="cf_email" datacollection="emailParam">


Tuesday, November 1, 2011

Null Values within SQL Select Statement with CFQUERYPARAM

I was working on code where I wanted to store nulls into the database and query based on those nulls.

<cfquery name="qRetrieve" datasource="#getDSN().getName()#">
 select semesterID,Major_ID,Campus_ID,AdmTypeReqID,startDate,startOrEndFlag
 from isFull
 where 
  semesterID = <cfqueryparam value="#arguments.DTO.getsemesterID()#" cfsqltype="CF_SQL_INTEGER">
  AND Campus_ID = <cfqueryparam value="#arguments.DTO.getCampus_ID()#" cfsqltype="CF_SQL_INTEGER">
  AND startOrEndFlag = <cfqueryparam value="#arguments.DTO.getstartOrEndFlag()#" cfsqltype="CF_SQL_INTEGER">
  AND AdmTypeReqID= <cfqueryparam value="#arguments.DTO.getAdmTypeReqID()#" cfsqltype="CF_SQL_INTEGER" null="#iif(arguments.DTO.getAdmTypeReqID(),true,false)#">
  
</cfquery>



Notice the last line is using the function IIF inside the null attribute of cfqueryparam.  In the SQL Insert and Update statements, this just works.  In a select against MSSQL 2005, though, it is not working: I am getting zero rows back when I should be getting 5-10 rows.  This is most likely because SQL never treats AdmTypeReqID = NULL as true; a null has to be matched with IS NULL.  For now I have replaced my last line of code with the following lines of code

AND 
   <cfif arguments.dto.getAdmTypeReqID() EQ "">
   AdmTypeReqID is null
   <cfelse>
   AdmTypeReqID = <cfqueryparam value="#arguments.DTO.getAdmTypeReqID()#" cfsqltype="CF_SQL_INTEGER" >
   </cfif>

Has anyone else run into this?