Blog

Network changes coming up. IP addresses will change on June 27th 2024

We are making some changes to our infrastructure, which means new IP addresses will be needed to connect to us. Most users will not be affected by this change.

 

If you only use our default URL (something.thegood.cloud) to connect to your environment, this change has no impact on you. However, if you use your own domain name (e.g. cloud.something.com) to connect to your environment, or if you use IP filtering, you need to act.

 

If you are using your own domain, you must have set a DNS record to point it at our network. Please change the IP address of that record to the new IP address before June 27th 2024.
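Once you have updated the record, a small script can confirm that your domain resolves to the new address. The following is only a minimal sketch: cloud.something.com is the example domain from above, 203.0.113.10 is a placeholder from the documentation address range (not one of our real addresses), and the function names are made up for illustration. Use the addresses from our email instead.

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return {info[4][0] for info in infos}


def record_is_updated(resolved, expected):
    """True when at least one address resolved and all of them are expected."""
    resolved = set(resolved)
    return bool(resolved) and resolved <= set(expected)


def check_domain(hostname, expected):
    """Print whether the hostname already points at the expected addresses."""
    try:
        current = resolve_ipv4(hostname)
    except socket.gaierror as err:
        print(f"Could not resolve {hostname}: {err}")
        return False
    ok = record_is_updated(current, expected)
    state = "points at the new address" if ok else "still needs updating"
    print(f"{hostname} resolves to {sorted(current)}: {state}")
    return ok
```

For example, check_domain("cloud.something.com", {"203.0.113.10"}) reports whether the placeholder address is the only one returned. Keep in mind that DNS changes can take a while to propagate, so a recently changed record may still show the old address for some time.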

We will send out emails with the new IP addresses to all Consumer and Business clients as soon as possible.

 

If you require any help or guidance in this regard, please reach out to help@thegoodcloud.nl

 

The Good Journal #9 A Quest for Good AI

Soon, we’ll make Nextcloud server v27 our production version and start gradually rolling out the upgrade to every environment. It introduces some neat new features, most notably the AI features.

We’re not too fond of the general usage of the term “AI”. This is not the new Skynet, and it’s not on the way to creating terminators. We like “Large Language Model” or “LLM” much more. Because that is all it is: a big box of data with an index, able to give you a guesstimate of what you’d likely want to see next based on the input you’ve provided. At first glance, it may pass a Turing test, but it has no intelligence; see the Chinese room argument: https://plato.stanford.edu/entries/chinese-room/

 

 

The difference in Ethics between development and hosting

 

Nextcloud has set a rating system for judging the Ethical standards of the “AI” features and apps you can use within Nextcloud. This is done from a developer perspective, and as long as you can control the software it runs on, use the software the model is trained with and curate the training data yourself, it gets a good rating.

This is sufficient effort from the developer’s point of view. When you consider hosting such things, it gets a little more complicated. If you simply offer a VM with some hardware geared towards running such software, you have to leave the selection of the models up to the admin or user. This is how most AI-as-a-service handles this problem. It leaves it up to the user to determine what models they find ethical and acceptable. If there were a clear contender for this, it would not be a problem. However, I have not found a model that clearly avoids copyright issues. Some people are working on this concept: https://huggingface.co/blog/Pclanglais/common-corpus

This proves that you do not have to accept copyright violations, a situation currently being framed as unavoidable by many “AI” companies. But this has not yet evolved into a usable model.

We do “managed hosting.” We’re involved with usage and workflow, so we share responsibility for this final step. Ideally, we would utilize a model trained with a public-domain dataset, but this work is currently incomplete and unusable.

 

 

What about the visual models?

 

They essentially have the same issue. At the time of writing, several companies have trained models using Creative Commons or their own images, avoiding the copyright issue in the training data (again showing that this can be done), but neither the model, nor the software used to train it, nor the training data is available. These would still score poorly in Nextcloud’s rating system, and they lack an API to connect to.

 

The Good, the Bad, the Ethical.

 

We have an additional requirement for the Ethical rating: the model used must be trained on curated data to avoid any copyright infringement issues. It’s not sufficient to technically be able to gather, select, and curate data and train your own model. As a small hosting company, we do not have an ethics or AI department, which limits our ability to curate data and train models. Additionally, the usage must stay within reasonable CPU and memory limits to avoid a large price increase.
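To make the difference between the two ratings concrete, here is a rough sketch in Python. The first function follows Nextcloud’s published scheme, which counts three conditions: open-source software, a freely available model, and available training data. The one-step downgrade in the second function is only our reading of the listings further down, not a formal rule, and the function and parameter names are made up for illustration.

```python
# Ratings ordered from worst to best, so the index equals
# the number of Nextcloud conditions that are met.
RATINGS = ["Red", "Orange", "Yellow", "Green"]


def nextcloud_rating(open_source_software, model_freely_available,
                     training_data_available):
    """Nextcloud's rating: count how many of the three conditions hold."""
    met = sum([open_source_software, model_freely_available,
               training_data_available])
    return RATINGS[met]


def good_cloud_rating(nextcloud, curated_against_copyright):
    """Assumed rule: drop one step unless the training data is known
    to be curated to avoid copyright issues. Red cannot drop further."""
    if curated_against_copyright:
        return nextcloud
    index = RATINGS.index(nextcloud)
    return RATINGS[max(index - 1, 0)]


# A model that meets all three Nextcloud conditions but whose
# training data was not curated ends up one step lower with us.
print(good_cloud_rating(nextcloud_rating(True, True, True), False))
```

The resource limits mentioned above are deliberately left out of the sketch; they affect whether we can offer something at all, not its colour.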

 
 

But what if we disagree?

 

 

We want to give you the information as we see it and clarify some of our decisions. We will host some “AI” features and software and not others, but if you want to connect to ChatGPT or a VPS running LocalAI, we’re happy to help you connect it up and apply the API keys to your environment.

Remember that using this can cause your data to be processed outside of your country or by a company in a different country, which can break the digital sovereignty of your data. Even if you, for instance, utilize a VPS to host LocalAI on Amazon or even DigitalOcean, these companies and their servers are subject to the laws of the countries in which they are housed.

Let’s focus on the practical implications. There are several areas in Nextcloud where an LLM is used. I’ve provided the rating Nextcloud has given it and the one we would give it, as well as a status for the usage.

 

Text Generation

 

LocalAI

 

Currently, there is no model we could host or recommend that entirely avoids using copyrighted material.

  • Status: Tested, can be connected on request
  • Nextcloud rating: Green
  • TheGoodCloud: Yellow

 

OpenAI

 

This is ChatGPT 4, a controversial model known for containing copyrighted material. It’s likely no surprise to anyone reading this that it ticks none of the boxes. It works rather well, but it’s not as open as you’d expect from the name.

  • Status: Tested, API connection can be requested.
  • Nextcloud rating: Red
  • TheGoodCloud: Red

 

Images

 

Recognize app

 

Image, object and face recognition.

We do not enable this app by default, and we’re currently testing its functionality. It’s heavy on resources and likely will not function properly in our smaller consumer environments without us increasing the price for these environments to dedicate more CPU and Memory. The models come fully trained and do not incorporate any corrections or adjustments from Nextcloud itself. The training data for objects, faces, and actions is available, but information on any curating of the training data is lacking. The training data for the music genre recognition model is unavailable. Somehow, this still fetches a green, ethical rating from Nextcloud, but not so much from us.

  • Status: Testing, not ready. Creates a high load on small servers.
  • Nextcloud rating: Green
  • TheGoodCloud: Yellow (not curated to avoid copyright issues)
 

OpenAI

 

DALL-E image generation.

  • Status: Tested, works. External API, not open.
  • Nextcloud rating: Red
  • TheGoodCloud: Red
 

LocalAI

 

Uses a Stable Diffusion model. These models cause a lot of debate as they are known to contain copyrighted images and artworks.

  • Status: Works, can be self-hosted and connected using an API key.
  • Nextcloud rating: Yellow
  • TheGoodCloud: Orange

 

Translations

 

Translate

This app utilizes the OPUS models by the University of Helsinki. It is fully open-source. The data source was hard to find. However, the OPUS dataset compiles freely licensed multilingual content, such as translated Wikipedia articles, for training translation models. There is limited diversity in the languages supported.

  • Status: Tested and can be requested.
  • Nextcloud rating: Green
  • TheGoodCloud: Green

 

LibreTranslate integration

 

Requires running a LibreTranslate server somewhere: https://github.com/LibreTranslate/LibreTranslate

It is indeed open source but must be hosted on a separate server. I have not found a mention of how the training data was curated and collected. Until I know where that came from, it doesn’t score as high as the Translate app. If I do find it, and it is indeed curated to avoid copyright issues (which I suspect is true), we can run it in our Kubernetes cluster and offer it as a paid add-on, but the Translate app will likely suffice for most users.

  • Status: Testing/information incomplete.
  • Nextcloud rating: Green
  • TheGoodCloud: Yellow (needs more information)

 

DeepL Integration

 

Nothing is open source or available in the slightest. This is only for connecting the API. If you already use DeepL in your workflow, this can be handy, but if you’re looking for an ethical translation option, we’d recommend the Translate app.

  • Status: Connecting your DeepL account is available on request.
  • Nextcloud rating: Red
  • TheGoodCloud: Red

 

OpenAI

 

OpenAI has some very nice models and features, but none of the training data is open or actively curated to avoid copyright issues.

  • Status: Connection is tested. We can add the API key to the server if requested.
  • Nextcloud rating: Red
  • TheGoodCloud: Red

 

LocalAI

 

Currently, there is no model we could host or recommend that entirely avoids using copyrighted material.

  • Status: The connection is tested and can be connected with an API key.
  • Nextcloud rating: Green
  • TheGoodCloud: Yellow
In general:

We recommend the translate app. It’s local and open source, training data is available, and current models are already created using carefully curated data.

 

Speech-to-text options

 

This is not useful for dictation but can be used to generate a transcript for a presentation, for example.

 

Whisper Speech-To-Text app

 

The software is open-source, but the training data is not available.

  • Status: Testing, slow
  • Nextcloud rating: Yellow
  • TheGoodCloud: Orange (training data is not available and not curated to avoid copyright issues)

 

Replicate app

 

  • Status: Works, external API
  • Nextcloud rating: Yellow
  • TheGoodCloud: Orange (training data not available and not curated to avoid copyright issues)
 
 

OpenAI

 

 

  • Status: Works, external API. We can add the API key for your OpenAI account to your environment on request.
  • Nextcloud rating: Yellow
  • TheGoodCloud: Orange (training data not available and not curated to avoid copyright issues)
 

LocalAI

 

The software to run LocalAI is open source and can be self-hosted. However, the model’s training data is not available. This requires a separate server and setup.

  • Status: The connection is tested and will work, but TheGoodCloud will not host LocalAI or offer it as an add-on. If requested, we will apply your API key from your own hosted instance of LocalAI to the Nextcloud server.
  • Nextcloud rating: Yellow
  • TheGoodCloud: Orange
In general:

All speech-to-text options for Nextcloud rely on OpenAI’s Whisper models, whose training data is not freely available or curated to avoid copyright issues.

 

Misc

 

Mail

 

It is a separate app we don’t enable by default, but it is often requested.

The model is created and trained on-premises based on the user’s own data. It prioritizes your mail. Data will need to be gathered from your usage before accurately anticipating your workflow. All of this is done locally, and so we’re happy to enable this for you.

  • Status: Tested and can be requested.
  • Nextcloud rating: Green
  • TheGoodCloud: Green

Suspicious login

The model is created and trained locally. It helps flag login attempts that might be an issue. This is ethically fine and already enabled in most environments.

  • Status: Tested and shipped.
  • Nextcloud rating: Green
  • TheGoodCloud: Green

Trained locally by usage. All software is open source. (Nextcloud)

  • Status: Tested and can be requested.
  • Nextcloud rating: Green
  • TheGoodCloud: Green

 

As someone relying on accessibility software, I am very excited about the development of Large Language Models and their advancements. And none of this is to make a judgment on who is utilizing what and why. I very much understand that some of these features can help a lot of users in a lot of ways, but let’s be honest; you would not be reading this if you were not curious about how we try to do Good while offering the “AI” features. If I have missed some information in this blog post or if I have inadvertently misinterpreted some things, please let me know.

The Good Journal #8 - It’s not easy being green, but it is undeniably important.

At The Good Cloud, we’ve always made sure the data centre where our servers are housed is driven by renewable energy. We presumed this was the most we could do to ensure our GoodClouds aren’t bad for the environment. However, our new friends at Leafcloud have shown us there are many more possibilities with regard to being sustainable.

 

While exploring viable options for our business backup services, we discovered Leafcloud. Their operations, like ours, are anchored in the Netherlands, within the guardrails of the GDPR.

 

Leafcloud takes sustainability to new heights. They do more than just efficiently manage the excess heat generated by their servers. Or, as Leafcloud co-founder David Kohnstamm puts it: “It is just a way of saying ‘throwing away something valuable'”

 

This thermal energy is redirected to heat various facilities like nursing homes, swimming pools, and even large residential complexes. What’s more, housing their servers inside these facilities not only allows the vast majority of the server heat to significantly reduce the building’s footprint but, perhaps more importantly, also means not building wasteful, energy-guzzling new data centers.

With the rapid progress in AI technology, the energy consumption of servers worldwide is quickly reaching new peaks. Initiatives like Leafcloud’s are much-needed to make sure that technological advancements don’t come at the expense of our planet’s well-being.

 

As more sophisticated and resource-demanding features are developed within Nextcloud and our GoodCloud environments, we are committed to maintaining sustainable practices. We see it as our responsibility to embrace technological growth in a way that doesn’t compromise the health of our planet.

 

In this age of rapid digital transformation, it’s vital that we recognize our proverbial ‘clouds’ have a physical, environmental footprint. And we must do all we can to minimize the impact. We might still be some years away from Kermit singing a new song but, happily, being green just got a little easier.

The Good Journal #7 - It is your data, not their dataset

Whether you write, paint, sing, draw, design, calculate, report, present or play the theremin, it’s very likely that you’ll have files related to these pursuits, and these files are your property.

 

At The Good Cloud, such files are securely stored and available to you. But we don’t use your files for anything else. Your work will never become part of any kind of model while stored with us, which ensures that your work will not be easily recreated by an AI. This will protect you from copyright infringement by others, such as some authors and journalists are now facing.

 

In recent times, the fine print of cloud services has begun to incorporate the potential use of your data in large language models and visual models. For instance, if you use Adobe Creative Cloud, you should check your settings. If you didn’t opt out of the “content analysis,” you would have given them permission to use your work in their “techniques, such as machine learning.”

 

In December, Dropbox teamed up with OpenAI. They claim that data is shared only if users activate a specific feature, and in such cases we can only trust that this means the same thing as we ourselves envision.

 

The update to Google’s privacy policy in 2023 had a similar impact. They now reserve the right to scrape your data and behavior. While this might be limited to publicly available data, we again have to trust in the mutual understanding of what we mean by that. With Microsoft’s partnership with OpenAI, your data may have very few safe havens. Even if we can trust in this mutual understanding, I urge you all to consider with whom you store your data. Maintain control of your ownership.

 

This is not to join the general outrage parade towards AI itself. Honestly, I consider the entire debate to be overly focused on a symptom rather than a cause. The rush to offer AI is driven by overwhelming financial incentives leading to erroneous shortcuts, and as with all new emerging tech, misuse is initially rampant.

 

Personally, I see a lot of potential in the correct methods of feeding a dataset or model, and the informed utilization of the resulting tools. That being said, participation should be, and should have been, opt-in instead of opt-out.

 

Within our service and the Nextcloud software we utilize, you will find more and more options for integrating certain AI services such as ChatGPT or LocalAI. This will always remain opt-in and would only result in the tool being available within the service. It may help you correct that report you’re writing, but it will never scan or use the poem you’ve stored.

 

And, crucially, you don’t just have to trust our word on that. You can always jump into the Nextcloud community and have a good look at the code: https://github.com/nextcloud

 

So, will we be offering AI as a service?

 

No. While we initially considered it, we have decided not to offer a service like LocalAI hosted on our platform. We have not found a large language model that does not have a dubious origin story, nor have we found one that would not, on occasion, proclaim something dangerous or downright wrong. We will, however, be on hand to help you figure out how to connect what you would like to connect.

 

Take care of what you create. It’s precious.

 

Image: EasyDiffusion SD5

Text spelling: OnlyOffice autocorrect

Text copy editing: OpenAI’s ChatGPT 4 and Grammarly

The Good Journal #6 - The Importance of Digital Sovereignty

Imagine this scenario: without a second thought, you upload an important report to the usual big tech cloud provider service we’ve all used at times. It’s omnipresent and often pre-installed on your device—it’s been your go-to service for sharing business files. But today, as you tap the upload button to send it to “the cloud,” your eyes wander out of your office window. At the same moment you upload your report, a strong wind carries a little cloud in the sky away at a remarkable speed. Suddenly, you wonder about the journey your report is starting on.

 

The truth is, our clouds, and our data, are whisked away by the winds and end up hovering over distant countries with different laws and views on privacy. This casual relinquishing of control allows our data to be governed by other laws, other companies, and other values. Our data has been ensnared with convenient services and less pleasant actions, often without our conscious awareness.

 

Regardless of whether you approve of the laws and companies where your little report cloud now hovers, relinquishing control of our data in this way is inherently problematic. While we strive for clearer and updated privacy laws in Europe, those laws would fail to impact what is happening to your data if its cloud has been blown to the other side of the world. There are very clear differences in data privacy laws throughout the world, and your data might be kept, scanned, used, traded, sold, or even printed out and folded into origami, all without you ever knowing.

 

At TheGoodCloud, we keep your data in the Netherlands. It only ventures out on your own devices if you’ve synced it yourself. By default, it all stays in the Netherlands. Using our services, your little report cloud stays snugly above Amsterdam.

As a result, your data benefits from laws such as the General Data Protection Regulation (GDPR). This is the most comprehensive data protection law globally, with a strong emphasis on user rights such as the right of access, right to rectification, right to be forgotten, and data portability. GDPR also mandates data processors to maintain higher standards of data security.

 

Then there’s the Network and Information Security Directive (NISD): the first piece of EU-wide cybersecurity legislation. The NISD sets another layer of security requirements for businesses operating in critical sectors and digital service providers.

 

In contrast, some foreign laws include China’s cybersecurity law (CSL), which increases the government’s access to company data and creates uncertainty with ambiguous language and broad, vague provisions. It requires the data of services offered by Chinese companies to be stored on servers within China, potentially exposing it to state surveillance.

 

The US, until recently, relied mostly on self-regulation and doesn’t have a comprehensive federal-level privacy law comparable to the GDPR. The CLOUD Act allows federal law enforcement to compel U.S.-based technology companies via warrant or subpoena to provide requested data stored on servers, regardless of whether the data is stored in the U.S. or on foreign soil. So even when a US-based company assures you that your data is securely stored in the EU, their company will still have to comply with US laws, potentially poking and prodding your poor little lost report cloud.

 

With TheGood.Cloud, data sovereignty isn’t just an option—it’s a guarantee. We are a Dutch company and host your files on our own servers in the Netherlands.

Ready to exercise more control over your business data? Discover the difference with TheGood.Cloud. Bring that little report cloud home!

The Good Journal #5 Cloudy with a Chance of Perfect Privacy

Hello everyone!

 

Greetings from the heart of our vibrant community. The untold stories and unwavering support you never fail to render keep igniting our spirits. Today we’re over the moon to share the infectious energy bubbling here. Our recent “Invent a Slogan” venture was a smashing success, with all of you brilliant minds stepping up to breathe life into our cause.

 

Navigating the ocean of creative entries we received was quite an adventure, to say the least. Moments were filled with hearty laughter, thoughtful pauses, awe and wonder at the sheer genius displayed. The variety and originality highlighted the many facets of our shared mission.

 

Our hats are off to Matteo, Luna, and Mike who presented standout slogans, capturing the essence of our ethos with charm and originality. The thought and consideration behind each entry have left an indelible mark, inspiring us to aim higher and dig deeper. 

 

After an animated session of brainstorming and fervent discussions, with more than a couple of friendly squabbles, we eventually, unanimously, landed on our new slogan – “Cloudy with a Chance of Perfect Privacy”. It’s cheeky. It’s thought-provoking. It’s Us.

 

The cloud metaphor signifies the unpredictable and erratic nature of the Internet – a space often opaque and elusive. Yet, within this nebulous expanse, we are constantly on the lookout for that silver lining – a chance at perfect privacy. This notion drives us, fuels our passion to ensure every netizen can traverse the digital landscape confidently, assured of their privacy.

 

To honour this milestone, we are sending out a truly special T-shirt emblazoned with the winning slogan to its creative architect! Stay tuned for that joyous mail delivery. 

We are thrilled to announce that over the next few months, our new slogan will slowly pervade our website and service theming. It’s not just a catchphrase; it’s an affirmation of what we stand for – a world where privacy is not just a possibility, but an unwavering guarantee.

 

Once again, a heart-felt thank you to each one of you who joined in this revealing exercise of creativity and community spirit. Remember, even though the path forward may seem clouded, together, we can discover the chance of perfect privacy.

 

Exciting contests, engaging activities, and pleasing surprises are queued up for you in the coming times. So, stay tuned, stay engaged, and until next time, happy browsing!