Looking for answers from people who understand Boomi in detail, on whether it can support an ETL process.
I'm only just starting out with Boomi (a month or two in). I have done two days of jump-start training and might attempt the certification exam soon.
I have used Talend TIS and TAC for the last 7 years, and also plain Java in NetBeans, a Delphi "replicate" application, and Oracle PL/SQL for ETL.
My thoughts on Boomi at this stage are:
Pros:
- The platform is managed for you
- Process versions are managed for you
- Upgrades are provided for you
- Boomi does web services well
- Java under the covers
- True cloud app
- You can have a runtime Atom or Molecule on a cloud server, an on-premise server, or both
Have completed my first task using Boomi successfully.
Great reviews from peer companies that have used Boomi for around two years.
Cons:
- Not so sure you can go down to the pure Java level (which many people, especially managers, could consider a big "pro").
- I have done lots of ETL with a "compare source query result against destination query result" method. I'm still not entirely sure of the approach for this in Boomi, but I'm thinking it is possible.
- Yet to determine how fast processes are for larger-scale ETL (e.g. up in the millions of rows).
- License costs are based on connection counts, whereas other ETL tools charge by the number of developer users. I haven't compared the various ETL tool costs, so can't comment on a cost comparison.
- Not sure yet about ease of process monitoring; to be determined.
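The "compare source query result against destination query result" method is tool-independent, so here is a minimal sketch of it in plain Java. It assumes the two query results have already been reduced to key → row-hash maps; all names are illustrative, not Boomi APIs:

```java
import java.util.*;

public class DeltaCompare {
    // Given key -> row-hash maps built from the source and destination queries,
    // classify each key as an insert, an update, or a delete.
    public static Map<String, Set<String>> delta(Map<String, String> src, Map<String, String> dst) {
        Set<String> inserts = new TreeSet<>(), updates = new TreeSet<>();
        Set<String> deletes = new TreeSet<>(dst.keySet());
        deletes.removeAll(src.keySet()); // in destination but no longer in source
        for (Map.Entry<String, String> e : src.entrySet()) {
            String existing = dst.get(e.getKey());
            if (existing == null) inserts.add(e.getKey());            // new key
            else if (!existing.equals(e.getValue())) updates.add(e.getKey()); // row changed
        }
        return Map.of("insert", inserts, "update", updates, "delete", deletes);
    }

    public static void main(String[] args) {
        Map<String, String> src = Map.of("1", "a", "2", "b2");
        Map<String, String> dst = Map.of("2", "b", "3", "c");
        // key 1 -> insert, key 2 -> update, key 3 -> delete
        System.out.println(delta(src, dst));
    }
}
```

In a real process the row hashes would come from the two SQL queries, and the three key sets would drive the INSERT, UPDATE, and DELETE branches.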
Hopefully you will get answers from other people as well!
Horses for courses. Hopefully Boomi will provide good value.
Thanks for the input, Allan.
I have done multiple ETL/BI projects over the last 20 years, mainly based on Oracle PL/SQL and data warehouse solutions, in combination with BI tools from several vendors. So far, I have also used Boomi for ETL tasks such as financial consolidation.
What I like about Boomi
What I don't like? I'm sure other solutions have benefits too, of course. I would expect that Boomi is king regarding connectivity. I would also expect certain tools to offer smarter solutions for aggregating data. But to be honest, I'm not up to date with the top offerings in this market, and have no time to investigate it either. Boomi-ng business!
Based on your expertise, I agree with you, Sjaak Overgaauw.
I was only wondering whether a huge volume of transactions can be handled for ETL processing, or whether I should look at heavier tools that are actually dedicated to data integration.
A Molecule should be able to handle it, but I want to understand some cons before I completely agree that Boomi can serve the purpose of ETL processing.
For huge volumes, I would recommend using local Atoms. We have processed 1,000,000+ records like this. Tune the Atom and your database backend well: memory, CPUs, IO, etc. Sometimes other JDBC drivers and features can make a difference. Also tune your Boomi process; you will discover the difference when processing large datasets. Multi-threading is another option, of course, if your use case allows it. Anyway, there are lots of things you can tune when you want to use Boomi for ETL.
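The multi-threading idea above can be sketched in plain Java. Inside Boomi the Flow Control shape plays this role; the chunk size, pool size, and the trivial per-chunk work here are all illustrative assumptions:

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelChunks {
    // Split records into fixed-size chunks and process them on a small thread
    // pool, mirroring what Flow Control with N threads does inside an Atom.
    // Returns the total number of records processed.
    public static int processAll(List<String> records, int chunkSize, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < records.size(); i += chunkSize) {
            List<String> chunk = records.subList(i, Math.min(i + chunkSize, records.size()));
            futures.add(pool.submit(chunk::size)); // stand-in for real per-chunk work
        }
        int done = 0;
        for (Future<Integer> f : futures) done += f.get(); // wait for all chunks
        pool.shutdown();
        return done;
    }
}
```

The gain depends on whether the downstream system (database, API) can actually absorb parallel writes; if it serializes them, extra threads buy little.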
We use Boomi for a ton of scenarios, and the ETL use case is something we have been working with a lot.
What we have found is this:
If you're just moving data from source A to source B with relatively clean data (character-set-wise) and without a lot of transformation, then Boomi can work well for both small and large workloads. The problem with Boomi as an ETL tool comes when you have transformation or data character set / formatting issues. When Boomi fails in the mapper, the entire batch fails, not a specific record. If you try to run each record separately, the performance hit becomes a real problem, especially in medium to large record count scenarios.
The Boomi mapper isn't as robust as other tools around complex transformations, character set conversions, and data type conversions and formatting. We have used it in some cases, but for real, true ETL scenarios we have chosen another tool.
I think the use of Boomi in specific circumstances can be very successful, but no one tool can do everything well. We love the tool and would love to use it more for ETL, but just can't make it work well for many of our scenarios.
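The batch-failure complaint above is essentially the absence of per-record error isolation. A minimal sketch of what that isolation looks like, in plain Java (my own illustration; the point of the post is that Boomi's mapper does not behave this way):

```java
import java.util.*;
import java.util.function.Function;

public class RecordLevelErrors {
    // Apply the mapper to each record individually, collecting failures
    // instead of letting one bad record fail the whole batch.
    public static Map<String, List<String>> mapBatch(List<String> batch,
                                                     Function<String, String> mapper) {
        List<String> ok = new ArrayList<>(), failed = new ArrayList<>();
        for (String rec : batch) {
            try {
                ok.add(mapper.apply(rec));
            } catch (RuntimeException e) {
                failed.add(rec); // quarantine the bad record, keep going
            }
        }
        return Map.of("ok", ok, "failed", failed);
    }
}
```

The cost is one try/catch per record rather than per batch, which is cheap in code but, as noted above, expensive when a tool forces you to achieve it by running whole documents one at a time.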
I hope this helps
I'd be curious to know how Boomi was failing with regard to character set issues. Boomi should be able to handle any character set conversions necessary (as well as complex transformations, data type conversions, and formatting).
We have seen multiple scenarios where funky characters copied from Word into database tables via Oracle ERP Java forms cause the mapper to fail, even in simple data mapping. In other scenarios, the mapping functions fail when trying to reformat that kind of data.
The bigger issue, however, is the fact that you can't fail an individual record without processing record by record. In our experience this is not nearly as efficient as other tools, even with various tweaks to the Atom or Molecule.
We have also found that, for complex transformations, the error reporting and the mapper aren't as effective or performant as other tools.
Do you mean Oracle EBS? So, copying data from a Windows client (1252) to Oracle EBS using an older character set like 8859P1 or 8859P15? Which character sets does your system use? The application server and database have separate settings, btw.
Anyway, I can't comment on how Boomi deals with it or why it behaves as you describe. I simply haven't used these character sets and this "funky data" with Boomi. UTF-8 is more or less the standard nowadays, so this doesn't happen as much as it used to. Migrating the system is the structural solution in this case. It usually means creating a new database and importing the old one. That's a lot of work, so it's often done during a technology stack upgrade or a release upgrade. Or wait for the next ERP implementation if you have plans to phase out Oracle.
Obviously, the key here is defining the character sets well in the first place. As long as you define them correctly in Boomi and implement the appropriate steps to convert between them, there shouldn't be an issue. You can get into trouble, however, if the character data has already been corrupted by one of the end systems (e.g. trying to store characters in a data store whose character set does not support those characters). In that scenario, you may need to use another tool to attempt to "recover" those corrupted characters before applying the standard character set conversions.
Out of curiosity: how does Boomi behave if you process already "corrupted" data? Example:
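One common flavor of such corruption, sketched in plain Java (my own illustration of the mechanism, not necessarily what the earlier poster hit): text that was encoded as UTF-8 but decoded as Latin-1 somewhere upstream, so the "funky characters" are baked into the data before any integration tool ever sees it.

```java
import java.nio.charset.StandardCharsets;

public class Mojibake {
    // Simulate upstream corruption: UTF-8 bytes mis-decoded as ISO-8859-1.
    // Each multi-byte UTF-8 sequence turns into several Latin-1 characters.
    public static String corrupt(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(Mojibake.corrupt("café")); // prints "cafÃ©"
    }
}
```

Once data looks like this in the source table, a downstream character set conversion is operating on already-wrong characters, which is why no setting in the integration tool can cleanly fix it.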
We have tried decoding data from various character sets. Sometimes it has worked; other times it hasn't. It all depends on the funky characters we are dealing with. My larger point is that Boomi doesn't do a great job of addressing character set conversions on its own and, more importantly, that if there is a mapping failure, the whole batch fails. To get both the performance and the error handling, you effectively need to do a lot of the ETL work yourself: add multiple threads, play around with caching to batch the errors, and so on. In my opinion it's not nearly as efficient as other tools out there for this kind of purpose at this point. If it's the only tool you have, you can make it work, but it's not ideal from a capability and performance standpoint.
Marc Cohen, how did you end up handling record-level lookup?
An initial attempt at an insert/update ETL loader process using Boomi (still working on it). The delete part of the data sync will be another process, with the first query on the target. This might be useful to someone; it processes ~4,000 records in 26 seconds. Will try some performance improvements, maybe a publish/subscribe model.
Consider using the Add to Cache shape and the Flow Control shape to improve performance even further.
Thanks Vishwam! Good advice.
Yep, will use a cache to halve the number of lookup SQL queries issued.
Yep, have tried Flow Control with 5 threads, and this improved performance a bit as well.
A new process design using 2 cache objects, one for the source and one for the target, is as per this diagram.
It processes ~10,000 records per minute.
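The caching idea used here — query each key once, then serve repeats from memory — can be sketched like this in plain Java (an analogue of Boomi's Add to Cache / Retrieve from Cache shapes; all names are illustrative):

```java
import java.util.*;
import java.util.function.Function;

public class LookupCache {
    private final Map<String, String> cache = new HashMap<>();
    public int queriesIssued = 0; // counts how often we actually hit the "database"

    // Memoize lookups so repeated keys issue only one underlying SQL query.
    public String lookup(String key, Function<String, String> sqlQuery) {
        return cache.computeIfAbsent(key, k -> {
            queriesIssued++;          // only on a cache miss
            return sqlQuery.apply(k); // stand-in for the real SQL lookup
        });
    }
}
```

With the source and the target each behind such a cache, repeated keys in the dataset stop generating repeated SQL lookups, which is where the roughly halved query count comes from.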
Just to add my 2c on this after using AtomSphere for ~1 year as an ETL solution, replacing a big competitor's ETL solution and porting some custom-built ETL jobs. In my case I'm primarily using Boomi AtomSphere for moving data between Salesforce and a backend DB used for reporting services. I know this is an older topic, but hopefully this will help others.
- Fairly simple to learn yet powerful web UI.
- You don't have to code but have the ability to execute Groovy scripts on your data (ex: add complex cleanup logic).
- Prod & Test environments are included in the package. You know you will need to test somewhere, yet competitors typically charge you extra for a test environment.
- Email alerts can be set up to report process failures and Atoms going offline, without needing to build that into the processes themselves. Also, email alerts combine multiple errors/failures into a single email when they occur close to each other!
- No IO/JDBC batching capability in the Database connector (at least in the current v17). This is a big one for me and for anyone who is not able to place the Atom right next to the DB server. See my comments under this post and vote on this enhancement. Until this is implemented, I'm not able to use Boomi for ETL jobs against remote DB servers (ex: over VPN).
- No easy way of performing a MERGE/UPSERT on the database. The easiest solution I found was to use DELETE+INSERT statements in the same DB operation, or to write lengthy SQL.
- Updating Salesforce operations to add new custom fields could be made much easier. Custom SOQL in v18 of this connector helps, but that has its own limitations as well (see this).
- Often I'm not able to see the executing process's state until it's close to the end of its execution (ex: it's performing a DB insert operation, which is at the end of the process).
- High CPU usage by Salesforce connector no matter what the volume is (see this thread).
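To make the DELETE+INSERT workaround from the list above concrete, here is an in-memory analogue in plain Java (a sketch of the semantics only; in Boomi the two statements would sit in one Database operation, and databases with native MERGE support can do it in a single statement):

```java
import java.util.*;

public class UpsertSketch {
    // Emulate an upsert on a "table" keyed by id, the way the
    // DELETE+INSERT pair does it when no native MERGE is available.
    public static void deleteInsert(Map<Integer, String> table, int id, String value) {
        table.remove(id);     // DELETE FROM t WHERE id = ?   (no-op if absent)
        table.put(id, value); // INSERT INTO t (id, val) VALUES (?, ?)
    }
}
```

The DELETE is a no-op for new keys and clears the way for existing keys, so the pair behaves like an upsert either way; the trade-off against real MERGE is extra round trips and the loss of the old row between the two statements.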