Guru: Web Services, DATA-INTO and DATA-GEN, Part 2 – IT Jungle

April 12, 2021
In Part 1 of this series I discussed the use of DATA-GEN and DATA-INTO to create “blog entries” via a web service. This time I am going to focus on using the GET HTTP method to retrieve blog entries. As you will see, the basic process is very similar.
I am going to start by retrieving a single blog post, but rather than retrieve all of the data associated with the post I will show you how to restrict processing to specific elements. I will then move on to look at two approaches to processing multiple posts. The first processes all of the data in one go, the second processes it in “batches.”
The web service has the same base URL that we used before, but we will be using the GET HTTP method rather than the POST we used before. For a GET request, the web service checks to see if the URL has a post number at the end of it (i.e., after the final “/”). If it does, that specific post number will be retrieved. If you want to see what the result would look like, click this link. You should see the “raw” JSON data for blog post 15. If there is no post number present then all available blog posts will be returned. You can see this in action by clicking on the first link in this paragraph. We will be working through examples of both cases.
This story contains code, which you can download here.
Of course, viewing the JSON in the browser in this way, while useful for checking that the web service works without writing any code, is not terribly practical. So, I will use DATA-INTO to parse it, just as before. As you have just seen, for blog post 15, the web service returns JSON that looks like this:
However, for the purposes of this exercise I am going to assume that I have no need for the data contained in the “body” element. Omitting it from processing will reduce the amount of memory used by the program as this element can obviously be quite large. It will also speed up the processing a little — so this is the DS that I am going to be using:
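As a rough sketch of what such a DS might look like: the subfield names below (userId, id, title) and their sizes are assumptions based on the elements mentioned in this article, not a copy of the downloadable source. The point is simply that there is no subfield for “body”.

```rpgle
// Target DS for a single blog post (illustrative names and sizes).
// Each subfield name must correspond to a JSON element name.
dcl-ds responseData qualified;
  userId int(10);        // ID of the user who wrote the post
  id     int(10);        // blog post number
  title  varchar(200);   // assumed maximum length
  // Deliberately no "body" subfield: that element is not being processed
end-ds;
```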
As I noted earlier, for this service we are using the GET method, and any parameters are simply part of the URL. As a result the HTTPAPI call is much simpler. Here’s what that processing looks like now:
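A minimal sketch of a GET request with HTTPAPI follows. The URL is a hypothetical placeholder, the variable names are mine, and the location of the HTTPAPI_H copy member depends on how HTTPAPI was installed on your system.

```rpgle
**free
ctl-opt dftactgrp(*no) bnddir('HTTPAPI');

/copy qrpglesrc,httpapi_h

dcl-c BASE_URL 'https://example.com/posts';  // hypothetical base URL
dcl-s url        varchar(256);
dcl-s response   varchar(100000);            // assumed large enough for one post
dcl-s postNumber int(10) inz(15);

// For a GET the "parameter" is simply the end of the URL
url = BASE_URL + '/' + %char(postNumber);

// No request body or content type is needed for a GET
response = http_string('GET' : url);

*inlr = *on;
```

Compared with the POST in Part 1, there is no document to build and send, so only the method and the URL are passed.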
Once we get the response from the web service we use DATA-INTO to populate the DS. The code looks like this:
But there is a problem. If you were to compile and run this, you would receive a run time error to the effect that “The document for the DATA-INTO operation does not match the RPG variable; Reason code 5.”
And if you were to check you would find that code 5 is defined as:
“5. The document contains extra names that do not match subfields.”
The “extra names” of course refers to the “body” element in the JSON that I decided not to process. In order for DATA-INTO to process this document I will need to add the %DATA option ‘allowextra=yes’. This tells RPG that it is OK if there are elements in the JSON source that have no match in the target DS. You need to exercise a little caution when using this option as it does not provide for any granularity. That is to say that there is no way of saying: Allow the “body” element to be ignored, but all others must be present. As a result, if the response from the web service were to change and include additional elements we might never know about it because the “allowextra” option would also cause them to simply be ignored.
The resulting code looks like this:
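In outline, the statement might look like the fragment below. It assumes a DS named responseData and a character variable named response holding the JSON returned by HTTPAPI. The case=any option is my assumption, included so that mixed-case JSON names such as userId can match the RPG subfield names.

```rpgle
// Fragment only: responseData and response are assumed to be defined elsewhere.
// allowextra=yes lets DATA-INTO skip JSON elements (here "body")
// that have no matching subfield in the target DS.
data-into responseData
          %data(response : 'case=any allowextra=yes')
          %parser('YAJLINTO');
```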
And that is all there is to it. If you study the complete code for RPG program USEWEBSRV2 (which can be downloaded here) you will see that the program uses a Monitor group to trap any errors signaled by HTTPAPI. For this particular web service an attempt to retrieve details for a non-existent blog post results in HTTPAPI throwing an error. For simplicity I am simply treating all errors as indicating a “Not found” status for the requested post. With many web services this simplistic approach to error handling will be sufficient. For others I might need to process the exact error message returned and act accordingly. I will discuss how to do this in later tips.
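The shape of that error handling can be sketched as follows. This is an illustration rather than the code from USEWEBSRV2; the indicator name notFound is hypothetical, and url, response and responseData are assumed to be defined as described earlier.

```rpgle
// Fragment only: url, response and responseData are defined elsewhere.
dcl-s notFound ind inz(*off);   // hypothetical flag

monitor;
  response = http_string('GET' : url);
  data-into responseData
            %data(response : 'case=any allowextra=yes')
            %parser('YAJLINTO');
on-error;
  // Simplistic: every failure, including a DATA-INTO error,
  // is treated as "post not found"
  notFound = *on;
endmon;
```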
Earlier I mentioned that if there was no blog post number at the end of the URL, the web service would respond with a list of all blog posts. So how can I process that list?
There are two approaches to handling lists like this: all at once, and a “chunk” at a time. The first is best suited when you can either control the number of items that will be returned, or know that there can never be more than a given number. The second is useful when the number of results is unknown. It is also useful if the data can be processed in pieces, for example if the data returned were to be displayed in a subfile. Prior to the advent of IBM i V6 with its greatly increased size limits, it was also often necessary to use this second approach to facilitate processing results that included very large fields (descriptions for example). The new higher capacity limits reduce the number of times when we need to do this.
Let’s start by looking at the “all at once” approach. In simple cases, such as our example, all we need to do to facilitate this is to have DATA-INTO target a DS array. So the target definition changes to this:
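A sketch of such a definition is shown below. The dimension of 200 is an arbitrary assumption, as are the subfield names and sizes. Choose a dimension that you know the service can never exceed.

```rpgle
// Same layout as for a single post, but now an array of DS.
// DIM(200) is an assumed upper limit on the number of posts returned.
dcl-ds responseData qualified dim(200);
  userId int(10);
  id     int(10);
  title  varchar(200);
end-ds;
```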
How can we tell how many elements of the array were loaded by DATA-INTO? The RPG compiler facilitates this by providing an 8-byte count (an unsigned 20-digit integer) starting at position 372 in the Program Status Data Structure. In my program I defined it like this:
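One way to define it is sketched here. The DS name is my own, and the subfield is declared as uns(20) to follow the description above.

```rpgle
// Program Status Data Structure: only the one subfield we need.
dcl-ds pgmStatus psds;
  itemCount uns(20) pos(372);   // elements loaded by the last DATA-INTO
end-ds;
```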
And that is all that we need. After successfully invoking the web service I can then use the value in itemCount to control the processing of the array elements. In my test program USEWEBSRV3 I used it to control the number of elements scanned by the %LOOKUP operation, as you can see below.
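The following fragment sketches the idea. It assumes the responseData DS array and the itemCount PSDS subfield described above. The search for post number 15 is a hypothetical example, not necessarily the search performed in USEWEBSRV3.

```rpgle
// Fragment only: responseData and itemCount are defined elsewhere.
dcl-s index int(10);

// Search the id subfield, but only across the elements that were
// actually loaded (start at element 1, examine itemCount elements)
index = %lookup(15 : responseData(*).id : 1 : itemCount);

if index > 0;
  dsply ('Found post 15 at element ' + %char(index));
else;
  dsply ('Post 15 was not in the list');
endif;
```

Limiting the search with itemCount avoids comparing against the unused, initialized elements at the end of the array.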
Now let’s look at the second approach: processing the list a “chunk” at a time. In essence the process is simple: we merely have to change the target of the DATA-INTO operation so that rather than targeting a DS we instead identify a handler subprocedure. It is this subprocedure that will receive the “chunks” of data. So we change from this:
To this:
Notice that %HANDLER has replaced the DS name responseData. As far as DATA-INTO is concerned no other change is needed. The first parameter to %HANDLER identifies the subprocedure to do the processing. The second is known as the communications area and it is used to pass information between the code containing the DATA-INTO and the processing subprocedure. This is needed because your RPG code is not invoking your subprocedure directly, but rather indirectly via the RPG run-time. The use of such a value will be more obvious when we look at the code for the handler subprocedure.
I said that we would be processing the data in “chunks,” so how do we control the size of the chunk? The answer is very simple, although perhaps not immediately obvious. We control it by specifying in the subprocedure’s interface the number of elements to be handled at a time via a DIM. Here’s the procedure interface for my test program USEWEBSRV4.
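There are three parameters passed by RPG to my subprocedure: the communications area (count in my example), the DS array containing the current batch of elements, and a count (items) of the number of elements that RPG loaded into that array for this call.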
The other thing to note is that the procedure is defined as returning a four byte integer ( Int(10) ). This provides a method for the subprocedure to notify the RPG run time that it should abandon processing. I have never found a need to do anything other than return a value of zero which tells RPG to keep processing.
Within the subprocedure I am just doing minimal processing to demonstrate that the data was received. Here’s the code:
I start by displaying the variable items, which contains the count of the number of elements loaded by RPG into the DS passed as the second parameter. Next I display the user ID and blog post ID from the first item in the DS array. I then add the number of items being processed on this call to the count variable (i.e., the communications area). In my example I am using this to make the total number of elements processed available to the mainline code. In the previous example RPG supplied this count in the PSDS as noted earlier. Because we are processing in “chunks” RPG can no longer supply that value and so it is up to us to build it if we need it. If you study the source you will see that the value is subsequently displayed back in the mainline following the DATA-INTO operation.
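Putting the interface and the processing together, a handler along these lines would behave as just described. This is a sketch, not the code from USEWEBSRV4: the procedure name, the template DS, and the batch size of 5 are assumptions. The type of count must match the variable named as the second parameter of %HANDLER.

```rpgle
// Template describing one blog post (illustrative names and sizes)
dcl-ds post_t qualified template;
  userId int(10);
  id     int(10);
  title  varchar(200);
end-ds;

dcl-proc processPosts;
  dcl-pi *n int(10);
    count int(10);                      // communications area
    posts likeds(post_t) dim(5) const;  // DIM sets the "chunk" size
    items uns(10) value;                // elements loaded for this call
  end-pi;

  dsply ('Items in this batch: ' + %char(items));
  dsply ('User ' + %char(posts(1).userId)
       + ' post ' + %char(posts(1).id));

  // Accumulate the running total for the mainline code
  count += items;

  return 0;   // zero tells RPG to keep processing
end-proc;
```

Changing the DIM value on the second parameter is all it takes to change how many posts are delivered on each call.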
As you can see, the combination of HTTPAPI and RPG’s DATA-INTO can make it very easy to interact with web services. Obviously the example I have used here is a simple one, but it should give you a good idea of the basics. When dealing with more complex requirements, particularly ones involving nesting of elements, the biggest challenge is often defining the receiving data structure. As you gain experience this will become second nature to you, but when starting out you may find Scott Klement’s utility YAJLGEN useful. This tool is shipped with the YAJL library. It takes as input a sample of the anticipated response JSON and generates its best guess at the required DS. In fact, it generates a complete DATA-INTO test program. Of course, when it comes to the sizes of fields and arrays, it is very much a guess, something that the comments included at the beginning of the generated DS make abundantly clear.
In the next part of this series I will be looking at how DATA-INTO and DATA-GEN can be used to process JSON that includes elements with names that do not map to RPG names. I will also touch on some additional features of the YAJLINTO parser and the YAJLDTAGEN generator.
In the meantime, if you have any questions or if there are any particular aspects of this very broad topic you would like me to delve into in future tips please let me know.
Jon Paris is one of the world’s foremost experts on programming on the IBM i platform. A frequent author, forum contributor, and speaker at User Groups and technical conferences around the world, he is also an IBM Champion and a partner at Partner400 and System i Developer. Until Covid-19 messed everything up he hosted the RPG & DB2 Summit with partners Susan Gantner and Paul Tuohy. These days he has to be content with everything being done over Zoom. That includes the upcoming Summit Hands-On Live! Workshops, and the Virtual RPG & DB2 Summit.
Web Services, DATA-INTO and DATA-GEN, Part 1


Copyright © 2021 IT Jungle