We include an inefficient reference PyTorch implementation in gpt_oss/torch/design.py. This code uses basic PyTorch operators to show the exact model architecture, with a small addition that supports tensor parallelism in the MoE layers so that the larger model can run with this code.
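To make the tensor-parallel idea concrete, here is a minimal sketch of how one expert MLP can be split across ranks. It is not the reference implementation: it uses plain Python lists instead of PyTorch tensors, ReLU as a stand-in activation, and hypothetical names (`expert_mlp`, `shard_expert`, `tensor_parallel_mlp`). The assumption illustrated is that each rank holds a slice of the hidden dimension and the partial outputs are summed, which is the role an all-reduce would play across devices.

```python
# Illustrative only: a plain-Python stand-in for tensor-parallel sharding of
# one expert MLP. All names and shapes are hypothetical, not taken from gpt_oss.

def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def relu(v):
    """Elementwise activation (a stand-in; the real model may use another)."""
    return [max(0.0, a) for a in v]

def expert_mlp(w_up, w_down, x):
    """Unsharded expert: down(relu(up(x)))."""
    return matvec(w_down, relu(matvec(w_up, x)))

def shard_expert(w_up, w_down, world_size):
    """Split the hidden dimension evenly across world_size ranks."""
    hidden = len(w_up)
    assert hidden % world_size == 0, "hidden size must divide evenly"
    step = hidden // world_size
    shards = []
    for rank in range(world_size):
        lo, hi = rank * step, (rank + 1) * step
        up_shard = w_up[lo:hi]                       # rows of the up projection
        down_shard = [row[lo:hi] for row in w_down]  # matching columns of the down projection
        shards.append((up_shard, down_shard))
    return shards

def tensor_parallel_mlp(shards, x):
    """Each rank computes a partial output; the sum stands in for an all-reduce."""
    partials = [expert_mlp(up, down, x) for up, down in shards]
    return [sum(vals) for vals in zip(*partials)]

# Toy sizes: d_model = 2, hidden = 4.
W_UP = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0], [2.0, 1.0]]
W_DOWN = [[1.0, 0.0, 2.0, -1.0], [0.5, 1.0, 0.0, 1.0]]
X = [1.0, 2.0]

full = expert_mlp(W_UP, W_DOWN, X)
sharded = tensor_parallel_mlp(shard_expert(W_UP, W_DOWN, 2), X)
```

Because the activation is elementwise, slicing the hidden dimension leaves each rank's work independent until the final sum, so `sharded` matches `full` while each rank stores only a fraction of the expert weights.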