bfirsh's comments

Whenever I read about it, I am surprised at the complexity of iOS security. At the hardware level, kernel level, all the various types of sandboxing.

Is this duct tape over historical architectural decisions that assumed trust? Could we design something with less complexity if we designed it from scratch? Are there any operating systems that are designed this way?


>Is this duct tape over historical architectural decisions that assumed trust?

Yes, it's all making up for flaws in the original Unix security model and the hardware design that C-based system programming encourages.

> Could we design something with less complexity if we designed it from scratch? Are there any operating systems that are designed this way?

Yes, capability architecture, and yes, they exist, but only as academic/hobby exercises so far as I've seen. The big problem is that POSIX requires the Unix model, so if you want to have a fundamentally different model, you lose a lot of software immediately without a POSIX compatibility shim layer -- within which you would still have said problems. It's not that it can't be done, it's just really hard for everyone to walk away from pretty much every existing Unix program.
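To make the contrast concrete, here's a toy sketch (purely illustrative, not any real capability OS's API) of the difference between Unix-style ambient authority and an object-capability style, where code can only touch resources it was explicitly handed:

```python
import os
import tempfile

# Ambient authority (Unix model): any code running in the process can
# open any path the *user* can access -- a compromised library inherits
# the whole user's authority.
def unix_style_read(path):
    with open(path) as f:
        return f.read()

# Object-capability style: code receives an unforgeable handle to one
# specific resource and nothing else. No handle, no access.
class FileCapability:
    def __init__(self, path):
        self._fd = os.open(path, os.O_RDONLY)

    def read(self):
        os.lseek(self._fd, 0, os.SEEK_SET)
        return os.read(self._fd, 1 << 20).decode()

def capability_style_read(cap):
    # This function can only read the file behind `cap`; it has no way
    # to name, enumerate, or open anything else on the system.
    return cap.read()

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello")
    name = f.name

cap = FileCapability(name)
print(capability_style_read(cap))  # hello
```

POSIX programs assume the first style everywhere (bare `open()` calls on global paths), which is exactly why they break without a compatibility shim on a capability system.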


> seL4 is a fast, secure and formally verified microkernel with fine-grained access control and support for virtual machines.

https://medium.com/@tunacici7/sel4-microkernel-architecture-...

It's missing "the rest of the owl", so to speak, so it's a bit of a stretch to call it an operating system for anything more than research.


Vulnerabilities are inevitable, especially if you want to support broad use cases on a platform. Defense-in-depth is how you respond to this.


iOS is based on macOS, which is based on NeXTSTEP, which is a Unix.

It’s been designed with lower user trust since day one, unlike other OSes of the era (consumer Windows, Mac’s classic OS).

Just how much you can trust the user has changed over time. And of course the device has picked up a lot of capabilities and new threats, such as always-on networking in various forms and the fun of a post-Spectre world.


why not do both :)

I think there's also inherent trust in "hardware security", but as we all know it's all just hardcoded software at the end of the day, and complexity will bring bugs more frequently.


Yes, but they're architectural decisions made at Bell Labs in the 70s. iOS was always designed with the assumption that no one is trustworthy[0], not even the owner of the device. So there is a huge mismatch between "70s timesharing OS" and "phone that doesn't believe you when you say 'please run this code'". That being said, most of these security features are not duct-tape over UNIXisms that don't fit Apple's walled-garden nonsense. To be clear, iOS has the duct-tape, too, but all that lives in XNU (the normal OS kernel).

SPTM exists to fix a more fundamental problem with OS security: who watches the watchers? Regular processes have their memory accesses constrained by the kernel, but what keeps the kernel from unconstraining itself? The answer is to take the part of the kernel responsible for memory management out of the kernel and put it in some other, higher layer of privilege.
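As a toy model (not Apple's actual SPTM interface, names invented), the idea looks roughly like this: the kernel holds no direct access to page tables and can only *request* mapping changes from a higher-privilege monitor, which enforces invariants the kernel itself can't bypass:

```python
# Sketch: page-table management pulled out of the kernel into a monitor
# running at a higher privilege level. Even a fully compromised kernel
# can't violate the monitor's invariants (e.g. W^X).

class SecureMonitor:
    def __init__(self):
        self._page_table = {}  # virtual page -> (physical page, perms)

    def map_page(self, vpage, ppage, perms):
        # Invariant: no page is ever writable and executable at once,
        # so the kernel can't forge new executable code at runtime.
        if "w" in perms and "x" in perms:
            raise PermissionError("W^X violation refused by monitor")
        self._page_table[vpage] = (ppage, perms)

class Kernel:
    def __init__(self, monitor):
        # The kernel holds only a handle to the monitor, not the tables.
        self._monitor = monitor

    def alloc_executable(self, vpage, ppage):
        # Every mapping change must go through the monitor.
        self._monitor.map_page(vpage, ppage, perms="rx")

monitor = SecureMonitor()
kernel = Kernel(monitor)
kernel.alloc_executable(0x1000, 0x8000)  # fine: read + execute
try:
    monitor.map_page(0x2000, 0x9000, perms="rwx")  # refused
except PermissionError as e:
    print(e)
```

In the real design the "monitor" runs at a hardware-enforced privilege boundary, of course, not behind a Python method call.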

SPRR and GLs are hardware features that exist solely to support SPTM. If you didn't have those, you'd probably need to use ARM EL2 (hypervisor) or EL3 (TrustZone secure monitor / firmware), and also put code signing in the same privilege ring as memory access. You might recognize that as the design of the Xbox 360 hypervisor, which used PowerPC's virtualization capability to get a higher level of privilege than kernel-mode code.

If you want a relatively modern OS that is built to lock out the user from the ground-up, I'd point you to the Nintendo 3DS[1], whose OS (if not the whole system) was codenamed "Horizon". Horizon had a microkernel design where a good chunk of the system was moved to (semi-privileged) user-mode daemons (aka "services"). The Horizon kernel only does three things: time slicing, page table management, and IPC. Even security sensitive stuff like process creation and code signing is handled by services, not the kernel. System permissions are determined by what services you can communicate with, as enforced by an IPC broker that decides whether or not you get certain service ports.
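The broker idea can be sketched in a few lines (service names and policy invented for illustration, not Nintendo's real interfaces): a process's permissions are nothing more than the set of service ports the broker will hand it.

```python
# Toy IPC broker: "what can this process do" is exactly
# "which services will the broker connect it to".

class ServicePort:
    def __init__(self, name):
        self.name = name

class Broker:
    def __init__(self, policy):
        self._policy = policy  # process name -> set of allowed services

    def get_port(self, process, service):
        if service not in self._policy.get(process, set()):
            raise PermissionError(f"{process} may not access {service}")
        return ServicePort(service)

broker = Broker({
    "home_menu": {"fs:USER", "gsp::Gpu"},  # filesystem + GPU access
    "game":      {"fs:USER"},              # filesystem only
})

print(broker.get_port("home_menu", "gsp::Gpu").name)
try:
    broker.get_port("game", "gsp::Gpu")  # denied by policy
except PermissionError as e:
    print(e)
```

Note the kernel never appears here: it only shuttles the IPC messages, while the (user-mode) broker makes the security decision.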

The design of Horizon would have been difficult to crack, if it wasn't for Nintendo making some really bad implementation decisions that made it harder for them to patch bugs. Notably, you could GPU DMA onto the Home Menu's text section and run code that way, and it took Nintendo years to actually move the Home Menu out of the way of GPU DMA. They also attempted to resecure the system with a new bootloader that actually compromised boot chain security and let us run custom FIRMs (e.g. GodMode9) instead of just attacking the application processor kernel. But the underlying idea - separate out the security-relevant stuff from the rest of the system - is really solid, which is why Nintendo is still using the Horizon design (though probably not the implementation) all the way up to the Switch 2.

[0] In practice, Apple has to be trustworthy. Because if you can't trust the person writing the code, why run it?

[1] https://www.reddit.com/r/3dshacks/comments/6iclr8/a_technica...


Security in this context means the intruder is you, and Apple is securing their device so you can't run code on it, without asking Apple for permission first.


That makes no sense for a phone because you go outside with it in your pocket, leave it places, connect to a zillion kinds of networks with it, etc. It's not a PC in an airgapped room. It is very easy for the user of the device to be someone who isn't you.


It can be both.

Any sufficiently secure system is, by design, also secure against its primary user. In the business world this takes the form of protecting the business from its own employees in addition to outside threats.


Free usage usually goes in sales and marketing. It's effectively a cost of acquiring a customer. This also means it is considered an operating expense rather than a cost of goods sold and doesn't impact your gross margin.

Compute in R&D will be only training and development. Compute for inference will go under COGS. COGS is not reported here but can probably be, um, inferred by filling in the gaps on the income statement.

(Source: I run an inference company.)


I think it makes the most sense this way, but I've seen it accounted for in other ways. E.g. if free users produce usage data that's valuable for R&D, then they could allocate a portion of the costs there.

Also, if the costs are split, there usually has to be an estimation of how to allocate expenses. E.g. if you lease a datacenter that's used for training as well as paid and free inference, then you have to decide a percentage to put in COGS, S&M, and R&D, and there is room to juice the numbers a little. Public companies are usually much more particular about tracking this, but private companies might use a proxy like % of users that are paid.
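As a toy illustration of that allocation (all numbers invented, and "paid-user share of inference" used as the proxy), splitting one shared datacenter bill might look like:

```python
# Hypothetical split of a shared compute bill across R&D, COGS, and S&M.
monthly_compute = 10_000_000   # total datacenter cost, $
training_share = 0.40          # training/dev compute -> R&D
paid_user_share = 0.25         # of remaining inference, fraction serving paid users

inference = monthly_compute * (1 - training_share)
rnd = monthly_compute * training_share               # training -> R&D
cogs = inference * paid_user_share                   # paid inference -> COGS
sales_marketing = inference * (1 - paid_user_share)  # free inference -> S&M

revenue = 4_000_000
gross_margin = (revenue - cogs) / revenue  # only COGS hits gross margin
print(f"COGS ${cogs:,.0f}, S&M ${sales_marketing:,.0f}, R&D ${rnd:,.0f}")
print(f"Gross margin {gross_margin:.1%}")
```

Nudging `paid_user_share` down a few points moves cost out of COGS and into S&M, which is exactly the "juice the numbers" lever described above: gross margin improves without a single dollar of cost going away.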

OpenAI has not been forthcoming about their financials, so I'd look at any ambiguity with skepticism. If it looked good, they would say it.


Founder of Replicate here. We should be on par or faster for all the top models. e.g. we have the fastest FLUX[dev]: https://artificialanalysis.ai/text-to-image/model-family/flu...

If something's not as fast let me know and we can fix it. ben@replicate.com


Hey Ben, thanks for participating in this thread. And certainly also for all you and your team have built.

Totally frank and possibly awkward question, you don't have to answer: how do you feel about a16z investing in everyone in this space?

They invested in you.

They're investing in your direct competitors (Fal, et al.)

They're picking your downmarket and upmarket (Krea, et al.)

They're picking consumer (Viggle, et al.), which could lift away the value.

They're picking the foundation models you consume. (Black Forest Labs, Hedra, et al.)

They're even picking the actual consumers themselves. (Promise, et al.)

They're doing this at Series A and beyond.

Do you think they'll try to encourage dog-fooding or consolidation?

The reason I ask is because I'm building adjacent or at a tangent to some of this, and I wonder if a16z is "all full up" or competitive within the portfolio. (If you can answer in private, my email is [my username] at gmail, and I'd be incredibly grateful to hear your thoughts.)

Beyond that, how are you feeling? This is a whirlwind of a sector to be in. There's a new model every week it seems.

Kudos on keeping up the pace! Keep at it!


That feels like the VC equivalent of buying a market-specific fund, so fairly par for the course?


Replicate (YC W20) | San Francisco, CA + Remote | https://replicate.com/

Replicate makes it easy to run AI in the cloud. You can run a big library of open source models with a few lines of code, or deploy your own models at scale.

We're an experienced team from Spotify, Docker, GitHub, Heroku, NVIDIA, and various other places. We're backed by a16z, Sequoia, NVIDIA, Andrej Karpathy, Dylan Field, Guillermo Rauch.

We're hiring:

- An infrastructure engineer

- A machine learning engineer who's an expert at image models

- An engineer who likes talking to people to look after our customers

... and more: https://replicate.com/about


You're right -- this wasn't clear. Added another paragraph to explain what you had to do before.


I can confirm these are dangerous. There are several of these in Berkeley and I got knocked off my bicycle on one of them for exactly the reason you describe.

I am from the UK and it makes me wonder why road design in the US is so bad. Just one minute of thinking about this as a layperson would reveal the problem with the design.

Is there some structural reason in the US that would cause it? Perhaps some lack of standards or approval process? Perhaps iteration speed is slower so they don’t get better? Some other incentives going on?


My personal hypothesis on this is that the worst 5% of Americans are likely both dumber and more sociopathic than Europeans, and the behavior of the worst drivers is what creates a lot of traffic and road accidents. If that is the case, you will not have the same kind of design that works in a high-trust, more cohesive society.



Replicate (YC W20) | San Francisco, CA + Remote | https://replicate.com/

Replicate makes it easy to run AI in the cloud. You can run a big library of open source models with a few lines of code, or deploy your own models at scale.

We're an experienced team from Spotify, Docker, GitHub, Heroku, Apple, and various other places. We're backed by a16z, Sequoia, Andrej Karpathy, Dylan Field, Guillermo Rauch.

We're hiring:

- An infrastructure engineer

- An expert at deploying and optimizing language models

- An engineer who is good at humans to look after our customers

... and more: https://replicate.com/about#join-us

Email us: jobs@replicate.com



Wow that is *so fast*, and from a little testing writes both rather decent prose and Python.


I guess the chat app is under quite a bit of load?

I keep getting error traceback "responses" like this:

TypeError: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app).

Traceback:

    File "/home/adminuser/venv/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 584, in _run_script
      exec(code, module.__dict__)
    File "/mount/src/snowflake-arctic-st-demo/streamlit_app.py", line 101, in <module>
      full_response = st.write_stream(response)


It claims to have a knowledge cut-off of 2021. Not sure if it's hallucinating or if that's true.

But when I asked it about the best LLMs it suggested GPT-3, BERT, and T5!


It might not be obvious from the title, but this model is absolutely massive: 480B parameters. The largest open-source model to date, I believe.

You can try it out here: https://arctic.streamlit.app/

Weights are here: https://huggingface.co/Snowflake/snowflake-arctic-instruct


I can still change the title, but I'm not sure if that would be okay according to HN guidelines.


Indeed – I was doing the editorialization in the comments. ;)

It's missing the name though (Arctic). That might be worth adding.


Done!

