@tobithiel tobithiel commented Nov 3, 2017

The testing methods used inside spark-redshift are unfortunately all marked private and can't be used from the outside. We were looking for a way to run our tests against local mocks instead of the real online services. We got it working, but had to make a few minor adjustments to spark-redshift. We would like to get these integrated into the official version.
Our changes are:

  1. We need the ability to provide a custom S3 client factory to spark-redshift, so we added an extra constructor to DefaultSource. The existing constructor also requires a JDBCWrapper, which is marked private and therefore doesn't seem to be usable from the outside.
  2. We're redirecting the Redshift queries to a local PostgreSQL instance (using redshift-fake-driver as a small compatibility layer). PostgreSQL unfortunately requires every subquery to have an alias, so we added one to the unload query generation.
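
To illustrate the second change, here is a minimal sketch of the idea, not the actual spark-redshift code. The object `UnloadQuerySketch`, the method `wrapWithAlias`, and the default alias name are hypothetical and only show how an alias can be appended to the wrapped subquery so that PostgreSQL accepts it:

```scala
// Hypothetical helper for illustration only; names are not taken from spark-redshift.
object UnloadQuerySketch {
  // Wraps the given query as a subquery in a FROM clause and gives it an alias.
  // Without the trailing "AS <alias>", PostgreSQL rejects the statement.
  def wrapWithAlias(
      columns: Seq[String],
      query: String,
      alias: String = "subquery_alias"): String = {
    // Quote each column name so mixed-case identifiers survive.
    val columnList = columns.map(c => "\"" + c + "\"").mkString(", ")
    s"SELECT $columnList FROM ($query) AS $alias"
  }
}
```

For example, `UnloadQuerySketch.wrapWithAlias(Seq("id", "name"), "SELECT * FROM users", "t")` builds a statement of the form `SELECT "id", "name" FROM (SELECT * FROM users) AS t`.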

Let me know what you think and if this could be merged in.
