IIS: Odd behavior of modules

April 2019

I have a few legacy websites, each with a lot of static HTML pages. I would like to use an IIS module to capture the generated page content and add HTML snippets that give each page a new header and footer (the decorator pattern). The code for the module is below.

The odd thing is that in many tests the module appears to be invoked twice when a page is loaded, with each invocation receiving part of the page content (the first gets the top portion of the page, the second the remainder). I know this because I use a static variable to count invocations and show the count in the new header and footer: the two numbers differ, and the footer number is always 1 larger than the header number. I was also able to export the page content into two different files to confirm it.

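using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

namespace MyProject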
{
    public class MyModule : IHttpModule
    {
        public void Dispose()
        {
        }

        public void Init(HttpApplication application)
        {
            application.ReleaseRequestState += new EventHandler(this.My_Wrapper);
        }

        public String ModuleName
        {
            get { return "MyProject"; }
        }

        public void My_Wrapper(Object source, EventArgs e)
        {
            HttpApplication app = (HttpApplication)source;
            HttpContext context = app.Context;
            HttpRequest request = context.Request;
            string requestPath = request.Path.ToString();

            //I have guarding code here so that the following code only applies to
            //web requests whose path ends in ".html".

            HttpContext.Current.Response.Filter = new WrapperFilter(HttpContext.Current.Response.Filter);
        }
    }

    public class WrapperFilter : MemoryStream
    {
        private static Regex startOfBody = new Regex("(?i)<body(([^>])*)>", RegexOptions.Compiled | RegexOptions.Multiline);
        private static Regex endOfBody = new Regex("(?i)</body>", RegexOptions.Compiled | RegexOptions.Multiline);

        private Stream outputStream = null;

        private static int index = 0;

        public WrapperFilter(Stream output)
        {
            outputStream = output;
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            string contentInBuffer = UTF8Encoding.UTF8.GetString(buffer);
            string page = new StringBuilder(contentInBuffer).ToString();
            byte[] outputBuffer = null;
            Match matchStartOfBody = null;
            Match matchEndOfBody = null;

            index++;

            matchStartOfBody = startOfBody.Match(page);
            string header = "html snippets for header: " + index;
            page = startOfBody.Replace(page, "<body " + matchStartOfBody.Groups[1] + ">" + header);

            matchEndOfBody = endOfBody.Match(page); 
            string footer = "html snippets for footer: " + index;
            page = endOfBody.Replace(page, footer + "</body>");

            outputBuffer = UTF8Encoding.UTF8.GetBytes(page);
            outputStream.Write(outputBuffer, 0, outputBuffer.Length);
        }
    }
}

Questions:

  1. Is the module invoked twice because the page content is too large, or do I need to increase a cache? If so, how?

  2. Technically, is my approach going to work? I was able to decorate HTML pages, but because of the two invocations I am unable to handle some advanced situations.

  3. When an image needs to be displayed in a browser page, does the request for the image go through IIS modules?

UPDATE

Based on the valuable input from usr, the "odd" behavior is just IIS's normal behavior. Following their suggestion, I added a class variable:

private byte[] allContent = new byte[0];

and the following updated method:

    public override void Write(byte[] buffer, int offset, int count)
    {
        //new bigger array; only "count" bytes of buffer, starting at "offset", are valid
        byte[] newArr = new byte[allContent.Length + count];
        //copy old content
        System.Array.Copy(allContent, newArr, allContent.Length);
        //append new content
        System.Array.Copy(buffer, offset, newArr, allContent.Length, count);
        //reset current total content
        allContent = newArr;
    }

and added a new method with all the code copied from my earlier Write method:

    protected override void Dispose(bool disposing)
    {
        //code copied from my earlier Write method, with "buffer" changed to "allContent".
    }
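Putting the two pieces of the update together, a minimal sketch of a buffering filter might look like the following. The class name `BufferingWrapperFilter` and the use of a `MemoryStream` as the buffer are assumptions made for illustration, not the asker's actual code. It collects every `Write` call and decorates the page once, when the filter is disposed. It derives from `Stream` rather than `MemoryStream` so that it can be exercised against any target stream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical sketch: buffers all writes, decorates the whole page on Dispose.
public class BufferingWrapperFilter : Stream
{
    private static readonly Regex StartOfBody =
        new Regex("<body([^>]*)>", RegexOptions.IgnoreCase | RegexOptions.Compiled);
    private static readonly Regex EndOfBody =
        new Regex("</body>", RegexOptions.IgnoreCase | RegexOptions.Compiled);

    private readonly Stream output;
    private readonly MemoryStream buffered = new MemoryStream();
    private bool flushed;

    public BufferingWrapperFilter(Stream output)
    {
        this.output = output;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Only the slice [offset, offset + count) holds valid data.
        buffered.Write(buffer, offset, count);
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing && !flushed)
        {
            flushed = true;
            // Decode once, after all chunks have arrived.
            string page = Encoding.UTF8.GetString(buffered.ToArray());
            // Decorate only the first <body ...> and the first </body>.
            page = StartOfBody.Replace(page, m => m.Value + "html snippets for header", 1);
            page = EndOfBody.Replace(page, "html snippets for footer</body>", 1);
            byte[] bytes = Encoding.UTF8.GetBytes(page);
            output.Write(bytes, 0, bytes.Length);
            output.Flush();
        }
        base.Dispose(disposing);
    }

    // The remaining Stream members; this filter is write-only.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { /* nothing is sent until Dispose */ }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

Because the page is decoded only after the last chunk has been received, a tag or a multi-byte character that straddles two `Write` calls is handled correctly. The cost is that the whole response is held in memory before anything is sent.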

Now everything works! Thank you, usr!!!

1 answer

OK, I should have solved this earlier. I admit I did not read every sentence of the question. I should have grown suspicious of the measurement. Turns out the measurement is broken.

> Thanks for asking the question about whether the page size matters. I did tests again. It does. For small pages, I see the same number in header and footer. For large pages, I see 3 and 4 or something like that.

Then:

    public override void Write(byte[] buffer, int offset, int count)
    {
        //...

        index++;

Write might be called an arbitrary number of times. This is a Stream implementation, and any caller can call Stream.Write as often as it wants to. You should expect that with any Stream.

The index can be incremented many times per page. The counting code is broken, the rest works.
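To make the difference concrete, here is a small sketch that counts filter instances and `Write` calls separately. `CountingFilter` and `WriteInChunks` are hypothetical names, and the 4096-byte chunk size is an arbitrary assumption, not a documented IIS value. The point is only that one response maps to one filter instance, while the number of `Write` calls depends on how the caller slices the data.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical sketch: counts filter instances and Write calls separately.
public class CountingFilter : Stream
{
    private static int instances;   // incremented once per filter instance
    private readonly Stream output;

    public int Id { get; private set; }          // stable for the whole response
    public int WriteCalls { get; private set; }  // grows with every chunk

    public CountingFilter(Stream output)
    {
        this.output = output;
        Id = Interlocked.Increment(ref instances);
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        WriteCalls++;
        output.Write(buffer, offset, count);
    }

    // Simulates a caller that sends data in fixed-size chunks.
    public static void WriteInChunks(Stream target, byte[] data, int chunkSize)
    {
        for (int pos = 0; pos < data.Length; pos += chunkSize)
        {
            int n = Math.Min(chunkSize, data.Length - pos);
            target.Write(data, pos, n);
        }
    }

    // The remaining Stream members; this filter is write-only.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { output.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

With a 10,000-byte page sent in 4096-byte chunks, `WriteCalls` ends at 3 while `Id` stays the same, which matches the observation that small pages show equal numbers and large pages do not.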

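Also, the UTF-8 processing is broken because you can't split UTF-8 encoded data at arbitrary boundaries: a multi-byte character can be cut in half between two Write calls, and decoding each chunk on its own corrupts it.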
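The UTF-8 problem can be illustrated with a short sketch. The class and method names below are hypothetical. `DecodePerChunk` mirrors what the original `Write` does, decoding each chunk on its own. `DecodeWithDecoder` uses a single stateful `Decoder` from `Encoding.UTF8.GetDecoder()`, which carries an incomplete byte sequence over to the next call.

```csharp
using System;
using System.Text;

// Hypothetical sketch: per-chunk decoding versus one stateful Decoder.
public static class Utf8SplitDemo
{
    // Decodes every chunk independently; a character split across
    // two chunks cannot be reconstructed this way.
    public static string DecodePerChunk(byte[][] chunks)
    {
        var sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            sb.Append(Encoding.UTF8.GetString(chunk, 0, chunk.Length));
        }
        return sb.ToString();
    }

    // Uses one Decoder for all chunks; it remembers trailing bytes of an
    // incomplete sequence and completes it with the next chunk.
    public static string DecodeWithDecoder(byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            // GetMaxCharCount leaves room for a character carried over from earlier bytes.
            char[] chars = new char[Encoding.UTF8.GetMaxCharCount(chunk.Length)];
            int n = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars, 0, n);
        }
        return sb.ToString();
    }
}
```

For example, "é" is encoded as the two bytes 0xC3 0xA9. If one chunk ends with 0xC3 and the next begins with 0xA9, only the stateful decoder yields the original character. Buffering the whole response and decoding it once avoids the issue as well.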

usr