OSGi Services over top, this gets a bit more interesting.
OSGi Services in many ways act like singletons instead of regular classes. When you fetch a Service in OSGi, you are not getting a new instance of the service class, but the Component class instance registered for the Service interface.
This distinction is important! Because the Service instance is shared, any member variables are also shared between invocations of the service, including invocations running at the same time on different threads. One of the most common places I’ve seen issues with this is Sling Servlets. Since they are registered as OSGi Services, you cannot safely keep per-request data in the member variables of a Sling Servlet.
Let’s see a simple example of this issue. I’ll create a simple servlet which uses a member variable to store user submitted information:
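import java.io.IOException;
import javax.servlet.Servlet;
import javax.servlet.ServletException;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.servlets.HttpConstants;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.osgi.framework.Constants;
import org.osgi.service.component.annotations.Component;

@Component(service = Servlet.class,
    property = {
        Constants.SERVICE_DESCRIPTION + "=Concurrency Demo Servlet",
        "sling.servlet.methods=" + HttpConstants.METHOD_GET,
        "sling.servlet.paths=" + "/bin/concurrencyservlet.dangerzone"
    })
public class ConcurrentServlet extends SlingSafeMethodsServlet {
    private static final long serialVersionUID = 1L;

    // DON'T DO THIS: one servlet instance serves every request,
    // so this field is shared by all users and all threads.
    private String lastValue;

    @Override
    protected void doGet(final SlingHttpServletRequest req,
            final SlingHttpServletResponse resp) throws ServletException, IOException {
        String val = req.getParameter("value");
        resp.setContentType("text/plain");
        resp.getWriter().write("Current Value = " + val + "\n");
        // Prints whatever the previous request (from anyone) submitted.
        resp.getWriter().write("Last Value = " + lastValue);
        lastValue = val;
    }
}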
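Here’s an example of the servlet in action: request /bin/concurrencyservlet.dangerzone?value=first, then request it again with value=second, and the second response reports "Last Value = first", even if the two requests came from different users.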
As you can see, the lastValue member variable is shared between requests. Imagine instead, this was a credit card number or Personally Identifiable Information!
The simplest fix is to not use member variables in OSGi Services! If you have multiple fields you want to save and pass to methods, creating a simple POJO can help to make this easier so you’re not passing in a large number of parameters.
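As a minimal sketch of the POJO approach, the plain-Java example below stands in for a shared service. The names `SubmissionContext`, `SubmissionHandler`, and the `userId` field are hypothetical and only for illustration. The point is that the shared class declares no mutable fields, and each call builds its own context object that is passed between helper methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for per-request values; a new instance is created for every call. */
final class SubmissionContext {
    final String value;
    final String userId;
    final List<String> messages = new ArrayList<>();

    SubmissionContext(String value, String userId) {
        this.value = value;
        this.userId = userId;
    }
}

/** Stands in for a shared OSGi service or servlet: it declares no mutable fields. */
final class SubmissionHandler {

    /** Entry point, comparable to doGet(): all state lives in local variables. */
    String handle(String value, String userId) {
        SubmissionContext ctx = new SubmissionContext(value, userId);
        validate(ctx);
        return render(ctx);
    }

    /** Helpers receive the context instead of reading shared fields. */
    private void validate(SubmissionContext ctx) {
        if (ctx.value == null || ctx.value.isEmpty()) {
            ctx.messages.add("value is required");
        }
    }

    private String render(SubmissionContext ctx) {
        return "user=" + ctx.userId + ", value=" + ctx.value + ", messages=" + ctx.messages;
    }
}
```

Because every call to `handle` creates its own `SubmissionContext`, two requests running at the same time never see each other's data, and the helper methods still take a single parameter.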
The AEM repository is the ultimate global variable. Its state is shared across all of the code accessing the repository and when you write code which will not execute in predictable order/time and relies on the state of the repository, it can be difficult to ensure that you’re not going to run into concurrency issues.
For example, let’s say you had a process which worked as follows:
This process will work great when each invocation is allowed to complete before the next starts, but what happens if it is kicked off by two different workflows at the same time? Or 10? Or 100? Imagine you have this process running on two pages, PageA and PageB, both of which reference each other.
The process starts on PageA, finds that PageB references PageA, then runs the move. At the same time, the process starts on PageB, finds that PageA references PageB, and moves it. Now that both moves are complete, the process for PageA looks up PageB to update its references, but the page is no longer at the same path! The same thing happens to the process for PageB when it tries to update references in PageA.
Depending on the timing and data, this process will work sometimes but fail other times, which makes the diagnosis even more difficult.
When you are making changes to the repository which may cause concurrency issues like the one described above, you need to make sure the entire process is completed before it starts the next update. There are two ways to ensure this:
The correct approach will depend on your needs and requirements, but the idea is the same — make sure that only one “item” is processed at a time.
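To illustrate the "one item at a time" idea, here is a minimal plain-Java sketch that funnels all work through a single worker thread. This is an assumption for illustration, not necessarily one of the approaches meant above; in AEM, an ordered Sling Job queue is often used for a similar effect. The class name `SerializedPageMover` and the log entries are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical processor that funnels every page move through one worker thread. */
final class SerializedPageMover {
    // A single worker thread runs queued tasks strictly one after another.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> log = Collections.synchronizedList(new ArrayList<>());

    /** Queue a move; callers on any thread return immediately. */
    void submit(String pagePath) {
        worker.execute(() -> {
            log.add("start " + pagePath);
            // Placeholder for the real work: gather references,
            // move the page, then update the referencing pages.
            log.add("end " + pagePath);
        });
    }

    /** Stop accepting work, wait for queued moves, and return the log. */
    List<String> finish() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("moves did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new ArrayList<>(log);
    }
}
```

Since the executor has only one thread, the move for PageA always runs to completion before the move for PageB begins, so the second move sees the repository in the state the first one left it. The trade-off is throughput: unrelated pages also wait in the same queue.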
Concurrency issues are challenging to identify, but knowing these common issues gives you a starting place to look to make sure your code is not affected by concurrency bugs.