Skip to content

Slow SUMIFS #2266

@RazorTag

Description

@RazorTag

EPPlus usage

Commercial use (I have a commercial license)

Environment

Windows

Epplus version

8.4.1

Spreadsheet application

Excel

Description

I am encountering very slow formula calculation of some SUMIFS. A performance improvement would be great, but a better understanding of known performance issues would also be valuable. I produced a very simplified example:

  • The test EPPlusCalculateFormulas30s has 10,000 uses of the formula "=SUMIFS([A],[A],[@A])" and takes roughly 30 seconds to calculate using EPPlus. If I increase the number of rows, performance scales with roughly the square of the number of rows/formula occurrences.
  • The test EPPlusCalculateFormulas1s has 10,000 uses of the formula "=SUMIFS([A],[A],1)" and takes roughly 1 second to calculate using EPPlus. Performance seems to scale linearly,
  • Desktop Excel calculates either of these workbooks in what seems to be less than a second. Excel's performance seems to scale linearly with either formula.

The tests below assume that you have the two workbooks from this zip file in the EPPlusTest directory within your user directory:
SUMIFS([A],[A],[@A]) {30s}.zip

using System;
using System.Diagnostics;
using System.IO;
using NUnit.Framework;
using OfficeOpenXml;

namespace ManualTests.Excel
{
    [TestFixture]
    public class EPPlusTests
    {
        private const string EPPlusTestDirectory = "EPPlusTest";
        private readonly string _testDirectory = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.UserProfile), EPPlusTestDirectory);

        [Test]
        public void EPPlusCalculateFormulas30s()
        {
            string filePath = Path.Combine(_testDirectory, "SUMIFS([A],[A],[@A]) {30s}.xlsx");
            using var excel = new ExcelPackage(filePath);

            var timer = Stopwatch.StartNew();
            excel.Workbook.Calculate();
            excel.Save();
            timer.Stop();

            Assert.That(timer.Elapsed, Is.LessThan(TimeSpan.FromSeconds(3)));
        }

        [Test]
        public void EPPlusCalculateFormulas1s()
        {
            string filePath = Path.Combine(_testDirectory, "SUMIFS([A],[A],1) {1s}.xlsx");
            using var excel = new ExcelPackage(filePath);

            var timer = Stopwatch.StartNew();
            excel.Workbook.Calculate();
            excel.Save();
            timer.Stop();

            Assert.That(timer.Elapsed, Is.LessThan(TimeSpan.FromSeconds(3)));
        }

        [Test]
        public void EPPlusCalculateRefError()
        {
            using var excel = new ExcelPackage();
            var sheet1 = excel.Workbook.Worksheets.Add("Sheet1");
            sheet1.Cells[1, 1].Value = "A";
            sheet1.Cells[1, 2].Value = "B";
            var table = sheet1.Tables.Add(sheet1.Cells["A1:B10001"], "Table");

            for (int rowNumber = 2; rowNumber <= 10001; rowNumber++)
            {
                sheet1.Cells[rowNumber, 1].Value = 1;
                sheet1.Cells[rowNumber, 2].Formula = "=SUMIFS([A],[A],[@A])";
            }

            var timer = Stopwatch.StartNew();

            excel.Workbook.Calculate();
            string filePath = Path.Combine(_testDirectory, "SUMIFS-#REF!.xlsx");
            excel.SaveAs(filePath);

            timer.Stop();
            Assert.That(timer.Elapsed, Is.LessThan(TimeSpan.FromSeconds(3)));
        }
    }
}

The test EPPlusCalculateRefError tries to recreate the "SUMIFS([A],[A],[@A]) {30s}.xlsx" workbook programmatically. EPPlus turns each occurrence of the formula "=SUMIFS([A],[A],[@A])" into "=SUMIFS([A],[A],#REF!)". I ran into this while writing a unit test to reproduce the performance issue described above. This may not be relevant.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    Status

    In progress

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions